We present the results obtained from an all-sky search for gravitational-wave (GW) bursts in the 
64-2000 Hz frequency range in data collected by the LIGO detectors during the first year (November 
2005 - November 2006) of their fifth science run. The total analyzed livetime was 268.6 days. Multiple 
hierarchical data analysis methods were invoked in this search. The overall sensitivity expressed in 
terms of the root-sum-square (rss) strain amplitude $h_{\rm rss}$ for gravitational-wave bursts with various 
morphologies was in the range of $6 \times 10^{-22}\,\mathrm{Hz}^{-1/2}$ to a few $\times 10^{-21}\,\mathrm{Hz}^{-1/2}$. No GW signals were 
observed and a frequentist upper limit of 3.75 events per year on the rate of strong GW bursts was 
placed at the 90% confidence level. As in our previous searches, we also combined this rate limit 
with the detection efficiency for selected waveform morphologies to obtain event rate versus strength 
exclusion curves. In sensitivity, these exclusion curves are the most stringent to date. 

PACS numbers: 04.80.Nn, 07.05.Kf, 95.30.Sf, 95.85.Sz 



I. INTRODUCTION 

After many years of preparation, interferometric grav- 
itational wave (GW) detectors have now begun an era 
of long-duration observing. The three detectors of 
the Laser Interferometer Gravitational-Wave Observatory 
(LIGO) [1] reached their design sensitivity levels 
in 2005 and began a "science run" that collected data 
through late 2007. This run is called "S5" since it followed 
a sequence of four shorter science runs that began 
in 2002. The German/British GEO600 detector [2] 
joined the S5 run in January 2006, and the Italian/French 
Virgo detector [3] began its first science run (denoted 
VSR1) in May 2007, overlapping the last 4.5 months of 
the S5 run. The data collected by these detectors provide 
the best opportunity yet to identify a GW signal (though 
detection is still far from certain) and are a baseline for 
future coordinated data collection with upgraded detectors. 

Gravitational waves in the frequency band of LIGO 
and the other ground-based detectors may be produced 
by a variety of astrophysical processes [4]. See for example 
[5] for inspiralling compact binaries, [6] for spinning 
neutron stars, [7] for binary mergers, and [8, 9, 10, 11] 
for core-collapse supernovae. 



The GW waveform emitted by a compact binary sys- 
tem during the inspiral phase can be calculated accu- 
rately in many cases, allowing searches with optimal 
matched filtering; see, for example, [12]. The waveform 
from the subsequent merger of two black holes is being 
modeled with ever-increasing success using numerical rel- 
ativity calculations, but is highly dependent on physi- 
cal parameters and the properties of strong-field gravity. 
The uncertainties for the waveforms of other transient 
sources are even larger. It is thus desirable to explore 
more generic search algorithms capable of detecting a 
wide range of short-duration GW signals from poorly- 
modeled sources — such as stellar core collapse to a neu- 
tron star or black hole — or unanticipated sources. As 
GW detectors extend the sensitivity frontier, it is impor- 
tant to not rely too heavily on assumptions about source 
astrophysics or about the true nature of strong-field gravity, 
and to search as broadly as possible. 

In this paper, we report on a search for GW "burst" 
signals in the LIGO data that were collected during the 
first 12 months of the S5 science run. A search for GW 
bursts in the remainder of the S5 data set, along with the 
Virgo VSR1 data, will be published jointly by the LSC 
and Virgo collaborations at a later date. 

The GW burst signals targeted are assumed to have 
signal power within LIGO's frequency band and durations 
shorter than ~1 s, but are otherwise arbitrary. This 
analysis, like most of our previously published searches 
for GW bursts, focuses on low frequencies — in this case 
64 Hz to 2000 Hz — where the detectors are the most sen- 
sitive. A dedicated search for bursts above 2000 Hz is 
presented in a companion paper [13]. 

Interferometric GW detectors collect stable, high- 
sensitivity ("science mode") data typically for several 
hours at a time, with interruptions due to adverse envi- 
ronmental conditions, maintenance, diagnostics, and the 
need to occasionally regain the "locked" state of the servo 
controls. In this analysis we searched the data at all times 
when two or more LIGO detectors were operating, a de- 
parture from the all-sky GW burst searches from earlier 
science runs [14, 15, 16, 17, 18], which required coincidence 
among three (or more) detectors. In this paper, 
the term "network" is used to describe a set of detectors 
operating in science mode at a given time. A network 
may include any combination of the Hanford 4 km (H1) 
and 2 km (H2) detectors, the Livingston 4 km (L1) detector 
and GEO600. Because the GEO600 detector was 
significantly less sensitive than LIGO during the S5 run 
(a factor of 3 at 1000 Hz, and almost two orders of mag- 
nitude at 100 Hz), we do not use its data in the initial 
search but reserve it for evaluating any event candidates 
found in the LIGO data. 

This paper presents results from three different "analysis 
pipelines", each representing a complete search. 
While the pipelines analyzed the data independently, 
they began with a common selection of good-quality data 
and applied a common set of vetoes to reject identifiable 
artifacts. Each pipeline was tuned to maximize the sensi- 
tivity to simulated GW signals while maintaining a fixed, 
low false alarm rate. The tuning of the pipelines, the 
choice of good data and the decision on the veto proce- 
dure were made before looking at potential candidates. 

No GW signal candidates were identified by any of the 
analysis pipelines with the chosen thresholds. In order 
to interpret this non-detection, we evaluate the sensitiv- 
ity of each pipeline for simulated signals of various mor- 
phologies, randomly distributed over the sky and over 
time. As expected, there are some sensitivity differences 
among the pipelines, although the sensitivities rarely dif- 
fer by more than a factor of 2 (see section VII) and no 
single pipeline performs best for all of the simulated sig- 
nals considered. We combine the results of the pipelines 
to calculate upper limits on the rate of GW bursts as a 
function of signal morphology and strength. 

The rest of the paper is organized as follows: After 
specifying the periods of data forming the first year of 
the S5 science run in Sec. II, Sec. III describes the state of 
the detectors during that period. Section IV summarizes 
the elements of this GW burst search which are common 
to all of the analysis pipelines. The analysis pipelines 
themselves are detailed in Sec. V and Appendices C, D 
and E. Section VI describes how each pipeline is tuned, 
while Sec. VII presents the sensitivity curves for simulated 
signals and Sec. VIII describes the systematic errors 
in these sensitivity curves. The results of the search 
are given in Sec. IX, and some discussion, including estimates 
of the astrophysical reach for burst candidates, is given in 
Sec. X. 



FIG. 1: The top diagram indicates the mutually exclusive 
livetimes and duty cycles of different networks available for 
detection searches. The category 1 and 2 data quality flags 
(DQF) and vetoes described in Appendices A and B have 
been applied. The bottom diagram indicates the mutually 
exclusive livetimes and duty cycles of the different networks 
after category 3 DQF and vetoes have been applied to define 
the data set used to calculate upper limits. 



II. S5 FIRST-YEAR DATA SET 

The search described in this paper uses data from ap- 
proximately the first calendar year of S5, specifically from 
November 4, 2005 at 16:00 UTC through November 14, 
2006 at 18:00 UTC. 

Figure 1 shows the amount of science-mode data collected 
("livetime") for each mutually-exclusive network of 
detectors along with percentages of the experiment calendar 
duration (duty cycle). The top Venn diagram represents 
the data with basic data quality and veto conditions 
(see Sec. IV and Appendices A and B), including 268.6 
days of data during which two or more LIGO detectors 
were in science mode; this is the sample which is searched 
for GW burst signals. An explicit list of the analyzed intervals 
after category 2 DQFs is available at [19]. The 
bottom Venn diagram shows the livetimes after the application 
of additional data quality cuts and vetoes that provide 
somewhat cleaner data for establishing upper limits 
on GW burst event rates. In practice, only the H1H2L1 
and H1H2 (not L1) networks, which encompass most of the 
livetime (224 days), are used to set upper limits. 



FIG. 2: Representative sensitivities of the LIGO detectors 
during the first year of S5. These curves show the amplitude 
spectral density of LIGO noise converted to GW strain units. 



III. THE DETECTORS 
A. LIGO 

The high sensitivity (see Fig. 2) and duty cycles (78.0% 
for H1, 78.5% for H2, and 66.9% for L1) achieved during 
the S5 run were the result of a number of improvements 
made prior to the run [20, 21]. The major changes 
were the successful operation at Livingston of a hydraulic 
external pre-isolator (HEPI) to suppress seismic distur- 
bances, and the implementation at both sites of a thermal 
compensation system (TCS) to reduce thermal lensing 
effects in the interferometer arm cavities due to optical 
absorption in mirror coatings and substrates. The HEPI 
system provides a reduction of the seismic noise by an 
order of magnitude in the band 0.2-2.0 Hz, and thus significantly 
improves the duty cycle of the L1 detector. 

Another significant improvement was the extension of 
the wave-front sensing (WFS) subsystem to control all 
alignment degrees of freedom of the core interferometer 
optics, leading to significantly reduced alignment fluctuations. 
Several improvements were made to the length 
sensing and control subsystem, enabling the photodetec- 
tors to take more power without saturation and thus al- 
lowing the laser power to be increased. A new method to 
calibrate the detectors was introduced, based on direct 
actuation of the test masses via radiation pressure from 
an auxiliary laser beam. Unlike the traditional coil-drive 
calibration method [22], which requires rather large test 
mass displacements, the new technique allows calibration 
of the detectors at a level closer to the anticipated signal 
strength. 



Other improvements included modifications to acous- 
tic and seismic isolation of optical tables with detection 
photodiodes, changes to the safety shutters to protect 
photodiodes from damage when interferometers fall out 
of lock, and improved detection of impending saturation 
of photodiodes to prevent lock losses. Finally, a number 
of physical effects which led to spurious transients and 
spectral lines in the data during previous science runs 
have been diagnosed and mitigated. 



B. GEO600 

The GEO600 detector, located near Hannover, Ger- 
many, was also operational during the S5 run, though 
with a lower sensitivity than the LIGO detectors. The 
GEO600 data were not used in the current study as the 
modest gains in the sensitivity to GW signals would not 
have offset the increased complexity of the analysis. The 
GEO600 data were held in reserve, and could have been 
used to follow up on detection candidates from the LIGO- 
only analysis. 

GEO600 began its participation in S5 on January 21, 
2006, operating in a night-and-weekend mode. In this 
mode, science data were acquired during nights and week- 
ends while commissioning work was performed during the 
daytime. The commissioning work focused mainly on 
gaining a better understanding of the detector and im- 
proving data quality. It was performed in a manner that 
avoided disrupting science periods and allowed for well- 
calibrated data to be acquired. Between May 1 and Oc- 
tober 6, 2006, GEO600 operated in so-called 24/7-mode, 
during which the detector's duty cycle in science-mode 
operation was maximized and only very short mainte- 
nance periods took place. Overall in 24/7-mode an in- 
strumental duty cycle of about 95% and a science-mode 
duty cycle of more than 90% were achieved. GEO600 re- 
turned to night-and-weekend mode on October 16, 2006, 
and work began on further improving the reliability of 
the instrumentation and reducing the glitch rate. The 
detector was operated in night-and-weekend mode un- 
til the end of S5 in October 2007. Overall, GEO600 
collected about 415 days of well-calibrated and charac- 
terized science data in the period between January 2006 
and October 2007. 



IV. ANALYSIS PIPELINE OVERVIEW 

In this search for GW bursts, three independent end- 
to-end analysis pipelines have been used to analyze the 
data. These pipelines were developed and implemented 
separately, building upon many of the techniques that 
were used in previous searches for bursts in the S1, S2, 
S3 and S4 runs of LIGO and GEO600 [14, 15, 16, 17, 22], 
and prove to have comparable sensitivities (within a factor 
of ~2; see Sec. VII). One of these pipelines is fully 
coherent in the sense of combining data (amplitude and 
phase) from all detectors and accounting appropriately 
for time delays and antenna responses for a hypotheti- 
cal gravitational-wave burst impinging upon the network. 
This provides a powerful test to distinguish GW signals 
from noise fluctuations. 

Here we give an overview of the basic building blocks 
common to all of the pipelines. The detailed operation 
of each pipeline will be described later. 



A. Data quality evaluation 

Gravitational-wave burst searches are occasionally af- 
fected by instrumental or data acquisition problems as 
well as periods of degraded sensitivity or nonstationary 
noise due to bad weather or other environmental condi- 
tions. These may produce transient signals in the data 
and/or may complicate the evaluation of the significance 
of other candidate events. Conditions which may ad- 
versely affect the quality of the data are catalogued dur- 
ing and after the run by defining "data quality flags" 
(DQFs) for lists of time intervals. DQFs are categorized 
according to their seriousness; some are used immedi- 
ately to select the data to be processed by the analysis 
pipelines (a subset of the nominal science-mode data), 
while others are applied during post-processing. These 
categories are described in more detail in Appendix A. 
In all cases the DQFs were defined and categorized before 
analyzing unshifted data to identify event candidates. 



B. Search algorithms 

Data that satisfy the initial selection criteria are 
passed to algorithms that perform the signal-processing 
part of the search, described in the following section and 
in three appendices. These algorithms decompose the 
data stream into a time-frequency representation and 
look for statistically significant transients, or "triggers". 
Triggers are accepted over a frequency band that spans 
from 64 Hz to 2000 Hz. The lower frequency cut-off is 
imposed by seismic noise which sharply reduces sensi- 
tivity at low frequencies, while the upper cut-off corre- 
sponds roughly to the frequency at which the sensitivity 
degrades to the level found at the low frequency cut-off. 
(A dedicated search for bursts with frequency content 
above 2000 Hz is presented in a companion paper [13].) 



C. Event-by-event DQFs and vetoes 

After gravitational-wave triggers have been identified 
by an analysis pipeline, they are checked against addi- 
tional DQFs and "veto" conditions to see if they occurred 
within a time interval which should be excluded from the 
search. The DQFs applied at this stage consist of many 
short intervals which would have fragmented the data 



set if applied in the initial data selection stage. Event- 
by-event veto conditions are based on a statistical corre- 
lation between the rate of transients in the GW channel 
and noise transients, or "glitches", in environmental and 
interferometric auxiliary channels. The performance of 
vetoes (as well as DQFs) is evaluated by the extent to 
which they remove the GW channel transients of each in- 
terferometer, as identified by the Kleine Welle (KW) [53] 
algorithm. KW looks for excess signal energy by decom- 
posing a timeseries into the Haar wavelet domain. For 
each transient, KW calculates a significance defined as 
the negative of the natural logarithm of the probability, 
in Gaussian noise, of observing an event as energetic as or 
more energetic than the one under consideration. The veto conditions, 
like the DQFs, were completely defined before unshifted 
data were analyzed to identify gravitational-wave event 
candidates. A detailed description of the implementa- 
tion of the vetoes is given in Appendix B. 
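
As a rough illustration of the significance defined above, the sketch below treats the simplest possible case: a single whitened coefficient that is unit-variance Gaussian in pure noise, so that the probability of an energy at least as large as w^2 is erfc(|w|/sqrt(2)). The function name and the single-coefficient assumption are ours; the actual KW algorithm works on clusters of Haar wavelet coefficients and is not reproduced here.

```python
import math


def gaussian_significance(coefficient: float) -> float:
    """Return -ln P(energy >= w^2) for one unit-variance Gaussian coefficient.

    Hypothetical helper: assumes a single whitened coefficient, not a
    cluster of coefficients as in a real transient finder.
    """
    # Two-sided Gaussian tail probability of observing |w| or larger.
    tail_probability = math.erfc(abs(coefficient) / math.sqrt(2.0))
    return -math.log(tail_probability)


if __name__ == "__main__":
    for w in (1.0, 3.0, 5.0):
        print(f"w = {w:.1f} sigma -> significance = {gaussian_significance(w):.2f}")
```

With this definition a louder transient always has a larger significance, and a coefficient of zero has significance zero.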

D. Background estimation 

In order to estimate the false trigger rate from detec- 
tor noise fluctuations and artifacts, data from the var- 
ious detectors are artificially shifted in time so as to 
remove any coincident signals. These time-shifts have 
strides much longer than the intersite time-of-flight for 
a true gravitational-wave signal and thus are unlikely to 
preserve any reconstructable astrophysical signal when 
analyzed. We refer to these as time-shifted data. Both 
unshifted and time-shifted data are analyzed by identical 
procedures, yielding the candidate sample and the esti- 
mated background of the search, respectively. In order 
to avoid any biases, no unshifted data are used in the 
tuning of the methods. Instead, combined with simula- 
tions (see below), background data are used as the test 
set over which all analysis cuts are defined prior to ex- 
amining the unshifted data-set. In this way, our analyses 
are "blind". 
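
The time-shift idea can be sketched in a few lines. The example below is a toy under stated assumptions, not the procedure used by any of the pipelines: triggers are reduced to bare arrival times, coincidence is a simple time window, and the shift is applied circularly over a segment of length `duration`. All function names are hypothetical.

```python
from bisect import bisect_left, bisect_right


def count_coincidences(times_a, times_b, window):
    """Count triggers in A with at least one trigger in B within +/- window seconds."""
    sorted_b = sorted(times_b)
    count = 0
    for t in times_a:
        lo = bisect_left(sorted_b, t - window)
        hi = bisect_right(sorted_b, t + window)
        if hi > lo:  # at least one B trigger lies inside the window
            count += 1
    return count


def background_counts(times_a, times_b, window, shifts, duration):
    """Coincidence counts after circularly shifting detector B by each offset.

    Shifts should be much longer than the coincidence window so that a real
    signal cannot stay coincident after shifting.
    """
    counts = []
    for shift in shifts:
        shifted_b = [(t + shift) % duration for t in times_b]
        counts.append(count_coincidences(times_a, shifted_b, window))
    return counts


if __name__ == "__main__":
    a = [10.0, 250.0, 900.0]
    b = [10.004, 500.0, 905.0]
    print("zero lag:", count_coincidences(a, b, 0.010))
    print("shifted :", background_counts(a, b, 0.010, [5.0, -5.0], 1000.0))
```

The zero-lag count plays the role of the candidate sample, while the counts from the shifted copies estimate the accidental background.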



E. Hardware signal injections 

During the S5 run, simulated GW signals were occa- 
sionally injected into the data by applying an actuation 
to the mirrors at the ends of the interferometer arms. The 
waveforms and times of the injections were cataloged for 
later study. These were analyzed as an end-to-end val- 
idation of the interferometer readout, calibration, and 
detection algorithms. 

F. Simulations 

In addition to analyzing the recorded data stream 
in its original form, many simulated signals are injected 
in software, by adding the signal to the digital 
data stream, to simulate the passage of 
gravitational-wave bursts through the network of detectors. 
The same simulated signals are analyzed by all 
three analysis pipelines. This provides a means for establishing 
the sensitivity of the search by measuring the 
probability of detection as a function of the signal morphology 
and strength. The resulting curves are referred to as 
efficiency curves. 



V. SEARCH ALGORITHMS 

Unmodeled GW bursts can be distinguished from in- 
strumental noise if they show consistency in time, fre- 
quency, shape, and amplitude among the LIGO detec- 
tors. The time constraints, for example, follow from the 
maximum possible propagation delay between the Hanford 
and Livingston sites, which is 10 ms. 
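
For orientation, the 10 ms figure is the light travel time across the straight-line separation of the two sites. The separation of roughly 3000 km used below is a commonly quoted value and is our assumption; it is not stated in the text above.

```latex
\Delta t_{\max} = \frac{d}{c}
  \approx \frac{3.0 \times 10^{6}\,\mathrm{m}}{3.0 \times 10^{8}\,\mathrm{m\,s^{-1}}}
  = 1.0 \times 10^{-2}\,\mathrm{s} = 10\,\mathrm{ms}
```

Any pair of triggers separated by more than this delay, plus an allowance for timing resolution, cannot come from a single GW passing both sites.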

This S5 analysis employs three algorithms to search for 
GW bursts: BlockNormal [25], QPipeline [26, 27], and 
coherent WaveBurst [28]. A detailed description of each 
algorithm can be found in the appendices. Here we limit 
ourselves to a brief summary of the three techniques. All 
three algorithms essentially look for excess power [29] in a 
time-frequency decomposition of the data stream. Events 
are ranked and checked for temporal coincidence and co- 
herence (defined differently for the different algorithms) 
across the network of detectors. The three techniques 
differ in the details of how the time-frequency decompo- 
sitions are performed, how the excess power is computed, 
and how coherence is assessed. Each analysis pipeline 
was independently developed, coded and tuned. Because 
the three pipelines have different sensitivities to different 
types of GW signals and instrumental artifacts, the re- 
sults of the three searches can be combined to produce 
stronger statements about event candidates and upper 
limits. 

BlockNormal (BN) performs a time-frequency decom- 
position by taking short segments of data and applying 
a heterodyne basebanding procedure to divide each seg- 
ment into frequency bands. A change-point analysis is 
used to identify events with excess power in each fre- 
quency band for each detector, and events are clustered 
to form single-interferometer triggers. Triggers from the 
various interferometers that fall within a certain coincidence 
window are then combined to compute the "combined 
power", $P_C$, across the network. These coincident 
triggers are then checked for coherence using CorrPower, 
which calculates a cross-correlation statistic $\Gamma$ that was 
also used in the S4 search [17]. A detailed description of 
the BN algorithm can be found in Appendix C. 

QPipeline (QP) performs a time-frequency decompo- 
sition by filtering the data against bisquare-enveloped 
sine waves, in what amounts to an over-sampled wavelet 
transform. The filtering procedure yields a standard 
matched filter signal to noise ratio (SNR), $\rho$, which is 
used to identify excess power events in each interferometer 
(quoted in terms of the quantity $Z = \rho^2/2$). Triggers 
from the various interferometers are combined to give 



candidate events if they have consistent central times and 
frequencies. QPipeline also looks for coherence in the response 
of the H1 and H2 interferometers by comparing 
the excess power of sums (the coherent combination H+) 
and differences (the null combination H-) of the data. 
Rather than using the single-interferometer H1, H2, L1 
signal to noise ratios, the QPipeline analysis uses the 
SNRs in the transformed channels H+, H-, and L1. A 
detailed description of the QPipeline algorithm can be 
found in Appendix D. 
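
The sum and difference combinations can be illustrated with a toy example. The sketch below assumes equal weights for the two co-located detectors, which is a simplification of ours; a real analysis would weight each stream by its noise level, and the function names are hypothetical.

```python
def coherent_and_null(h1, h2):
    """Return (H+, H-) sample streams from two co-located detector streams.

    Assumption for illustration: equal weights, H+ = (h1 + h2) / 2 and
    H- = (h1 - h2) / 2.
    """
    if len(h1) != len(h2):
        raise ValueError("streams must have the same length")
    h_plus = [(a + b) / 2.0 for a, b in zip(h1, h2)]
    h_minus = [(a - b) / 2.0 for a, b in zip(h1, h2)]
    return h_plus, h_minus


def energy(stream):
    """Sum of squared samples."""
    return sum(x * x for x in stream)


if __name__ == "__main__":
    signal = [0.0, 1.0, -2.0, 1.0]   # common to both detectors
    glitch = [0.0, 3.0, 0.0, 0.0]    # present in one detector only
    quiet = [0.0] * 4

    hp, hm = coherent_and_null(signal, signal)
    print("common signal: E(H+) =", energy(hp), " E(H-) =", energy(hm))

    hp, hm = coherent_and_null(glitch, quiet)
    print("single glitch: E(H+) =", energy(hp), " E(H-) =", energy(hm))
```

A signal common to both detectors cancels in H-, whereas a glitch in only one detector leaves as much energy in H- as in H+, which is what makes the null combination useful as a consistency test.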

Coherent WaveBurst (cWB) performs a time- 
frequency decomposition using critically sampled Meyer 
wavelets. The cWB version used in S5 replaces the sep- 
arate coincidence and correlation test (CorrPower) used 
in the S4 analysis [17] by a single coherent search statis- 
tic based on a Gaussian likelihood function. Constrained 
waveform reconstruction is used to compute the network 
likelihood and a coherent network amplitude. This co- 
herent analysis has the advantage that it is not limited 
by the performance of the least sensitive detector in the 
network. In the cWB analysis, various signal combinations 
are used to measure the signal consistency among 
different sites: a network correlation statistic cc, network 
energy disbalance $\Lambda_{\rm net}$, H1-H2 disbalance $\Lambda_{\rm HH}$, 
and a penalty factor $P_f$. These quantities are used in 
concert with the coherent network amplitude $\eta$ to develop 
efficient selection cuts that can eliminate spurious 
events with a very limited impact on the sensitivity. It 
is worth noting that the version of cWB used in the S5 
search is more advanced than the one used on LIGO and 
GEO data in S4 [15]. A detailed description of the cWB 
algorithm can be found in Appendix E. 

Both QPipeline and coherent WaveBurst use the free- 
dom to form linear combinations of the data to construct 
"null streams" that are insensitive to GWs. These null 
streams provide a powerful tool for distinguishing between 
genuine GW signals and instrument artifacts [30]. 



VI. BACKGROUND AND TUNING 

As mentioned in Sec. IV, the statistical properties of 
the noise triggers (background) are studied for all net- 
work combinations by analyzing time-shifted data, while 
the detection capabilities of the search pipelines for var- 
ious types of GW signals are studied by analyzing simu- 
lated signals (described in the following section) injected 
into actual detector noise. Plots of the parameters for 
noise triggers and signal injections are then examined 
to tune the searches. Thresholds on the parameters are 
chosen to maximize the efficiency in detecting GWs for a 
predetermined, conservative false alarm rate of roughly 
5 events for every 100 time shifts of the full data set, i.e. 
~0.05 events expected for the duration of the data set. 
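
The false alarm target translates into a simple counting rule: 5 surviving background events in 100 time shifts of the full data set correspond to 5/100 = 0.05 events expected in the unshifted data. The sketch below picks a threshold on a single ranking statistic that meets such a target. It is a simplified stand-in of ours; the actual pipelines tune several parameters jointly, and the function name and inputs are hypothetical.

```python
def threshold_for_false_alarms(background_stats, n_shifts, target_per_dataset=0.05):
    """Lowest threshold whose expected background per data set is <= target.

    background_stats: ranking-statistic values of all triggers found in
    n_shifts time-shifted copies of the data (hypothetical input).
    An event passes only if its statistic is strictly above the threshold.
    """
    # Number of background events allowed over all shifts, e.g. 0.05 * 100 = 5.
    allowed = int(target_per_dataset * n_shifts + 1e-9)
    ranked = sorted(background_stats, reverse=True)
    if len(ranked) <= allowed:
        return float("-inf")  # every background trigger may pass
    # Cutting at the (allowed + 1)-th loudest value leaves at most `allowed` events.
    return ranked[allowed]


if __name__ == "__main__":
    stats = [float(x) for x in range(1, 21)]
    thr = threshold_for_false_alarms(stats, n_shifts=100)
    survivors = sum(1 for s in stats if s > thr)
    print(f"threshold = {thr}, surviving background events = {survivors}")
```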

For a given energy threshold, all three pipelines ob- 
served a much larger rate of triggers with frequencies 
below 200 Hz than at higher frequencies. Therefore, each 
pipeline set separate thresholds for triggers above and 
below 200 Hz, maintaining good sensitivity for higher-frequency 
signals at the expense of some sensitivity for 
low-frequency signals. The thresholds were tuned sepa- 
rately for each detector network, and the cWB pipeline 
also distinguished among a few distinct epochs with dif- 
ferent noise properties during the run. A more detailed 
description of the tuning process can be found in Appen- 
dices C, D, and E. 

VII. SIMULATED SIGNALS AND EFFICIENCY 
CURVES 

In this section we present the efficiencies of the different 
algorithms in detecting simulated GWs. As in previous 
science runs, we do not attempt to survey the complete 
spectrum of astrophysically motivated signals. Instead, 
we use a limited number of ad hoc waveforms that probe 
the range of frequencies of interest, different signal dura- 
tions, and different GW polarizations. 

We choose three families of waveforms: sine-Gaussians, 
Gaussians, and "white noise bursts". An isotropic sky 
distribution was generated in all cases. The Gaussian 
and sine-Gaussian signals have a uniformly distributed 
random linear polarization, while the white noise bursts 
contain approximately equal power in both polarizations. 
We define the amplitude of an injection in terms of the 
total signal energy at the Earth observable by an ideal 
optimally oriented detector able to independently mea- 
sure both signal polarizations: 

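$$ h_{\rm rss} = \sqrt{\int_{-\infty}^{+\infty} \left( |h_+(t)|^2 + |h_\times(t)|^2 \right) dt} = \sqrt{\int_{-\infty}^{+\infty} \left( |\tilde{h}_+(f)|^2 + |\tilde{h}_\times(f)|^2 \right) df} . \qquad (7.1) $$

In reality, the signal observed at an individual detector 
depends on the direction $\Omega$ to the source and the 
polarization angle $\Psi$ through "antenna factors" $F_+$ and 
$F_\times$: 

$$ h_{\rm det} = F_+(\Omega, \Psi)\, h_+ + F_\times(\Omega, \Psi)\, h_\times . \qquad (7.2) $$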
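
The projection in Eq. (7.2) can be sketched as follows. The antenna factors used here are the standard long-wavelength expressions for an interferometer with perpendicular arms along the x and y axes, with the sky direction given by a polar angle `theta` and azimuth `phi` in the detector frame. That parametrization and its sign convention are our assumptions; conventions differ between references, and the function names are hypothetical.

```python
import math


def antenna_factors(theta, phi, psi):
    """Return (F_plus, F_cross) for sky angles (theta, phi) and polarization psi.

    Angles are in radians, measured in the detector frame (assumed convention).
    """
    half = 0.5 * (1.0 + math.cos(theta) ** 2)
    c2phi, s2phi = math.cos(2.0 * phi), math.sin(2.0 * phi)
    c2psi, s2psi = math.cos(2.0 * psi), math.sin(2.0 * psi)
    f_plus = half * c2phi * c2psi - math.cos(theta) * s2phi * s2psi
    f_cross = half * c2phi * s2psi + math.cos(theta) * s2phi * c2psi
    return f_plus, f_cross


def detector_strain(h_plus, h_cross, theta, phi, psi):
    """Project the two polarization time series onto one detector."""
    f_plus, f_cross = antenna_factors(theta, phi, psi)
    return [f_plus * p + f_cross * c for p, c in zip(h_plus, h_cross)]


if __name__ == "__main__":
    # A source directly overhead with psi = 0 couples fully to h_plus.
    print(antenna_factors(0.0, 0.0, 0.0))
    # A source in the detector plane along the arm bisector is not seen at all.
    print(antenna_factors(math.pi / 2.0, math.pi / 4.0, 0.0))
```

Because the factors never exceed unity in magnitude, the strain seen by a real detector is at most that seen by the ideal, optimally oriented detector used to define $h_{\rm rss}$.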

In order to estimate the detection efficiency as a function 
of signal strength, the simulated signals were injected 
at 22 logarithmically spaced values of $h_{\rm rss}$ ranging 
from $1.3 \times 10^{-22}\,\mathrm{Hz}^{-1/2}$ to $1.8 \times 10^{-19}\,\mathrm{Hz}^{-1/2}$, stepping 
by factors of ~$\sqrt{2}$. Injections were performed at quasi-random 
times regardless of data quality or detector state, 
with an average rate of one injection every 100 seconds. 
The efficiency of a method is then defined as the frac- 
tion of waveforms that are detected out of all that were 
injected into the data analyzed by the method. 
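
The amplitude grid and the efficiency definition can be written down directly. The sketch below builds 22 logarithmically spaced amplitudes between the two quoted endpoints, which gives a step ratio close to the square root of 2, and computes the efficiency as the detected fraction. Function names are hypothetical, and the list of detection flags is assumed to come from whatever pipeline is being evaluated.

```python
import math


def log_spaced_amplitudes(h_min=1.3e-22, h_max=1.8e-19, n=22):
    """Return n logarithmically spaced amplitudes from h_min to h_max."""
    ratio = (h_max / h_min) ** (1.0 / (n - 1))
    return [h_min * ratio ** k for k in range(n)]


def efficiency(detected_flags):
    """Fraction of injected waveforms that were detected."""
    return sum(1 for d in detected_flags if d) / len(detected_flags)


if __name__ == "__main__":
    amps = log_spaced_amplitudes()
    print(f"{len(amps)} amplitudes, step ratio = {amps[1] / amps[0]:.4f}, "
          f"sqrt(2) = {math.sqrt(2.0):.4f}")
    print("efficiency:", efficiency([True, False, True, True]))
```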

A. Simulated signals 

The first family of injected signals are sine-Gaussians. 
These are sinusoids with a central frequency $f_0$, dimensionless 
width $Q$ and arrival time $t_0$, defined by: 

$$ h_+(t_0 + t) = h_0 \sin(2\pi f_0 t) \exp\left[-(2\pi f_0 t)^2 / 2Q^2\right] . \qquad (7.3) $$

More specifically, $f_0$ was chosen to be one of (70, 100, 
153, 235, 361, 554, 849, 945, 1053, 1172, 1304, 1451, 
1615, 1797, 2000) Hz; and Q to be one of 3, 9, or 100. 
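
As a check on how the peak amplitude of a sine-Gaussian relates to its root-sum-square amplitude, the sketch below integrates the square of Eq. (7.3) numerically and compares it with the closed form obtained by doing the Gaussian integral by hand, h_rss = h_0 sqrt(Q (1 - exp(-Q^2)) / (4 sqrt(pi) f_0)). The signal is linearly polarized, so only the plus polarization contributes. The function names, sampling density and integration range are our choices.

```python
import math


def sine_gaussian(t, h0, f0, q):
    """Sine-Gaussian of Eq. (7.3), with t measured from the arrival time."""
    phase = 2.0 * math.pi * f0 * t
    return h0 * math.sin(phase) * math.exp(-phase ** 2 / (2.0 * q ** 2))


def hrss_numeric(h0, f0, q, n_tau=8.0, samples_per_cycle=64):
    """Root-sum-square amplitude from a Riemann sum over +/- n_tau envelope widths."""
    tau = q / (math.sqrt(2.0) * math.pi * f0)  # envelope is exp(-t^2 / tau^2)
    dt = 1.0 / (samples_per_cycle * f0)
    n = int(n_tau * tau / dt)
    total = sum(sine_gaussian(k * dt, h0, f0, q) ** 2 for k in range(-n, n + 1))
    return math.sqrt(total * dt)


def hrss_analytic(h0, f0, q):
    """Closed-form root-sum-square amplitude of the same waveform."""
    return h0 * math.sqrt(q * (1.0 - math.exp(-q ** 2)) / (4.0 * math.sqrt(math.pi) * f0))


if __name__ == "__main__":
    for f0, q in ((235.0, 9.0), (70.0, 3.0), (1053.0, 9.0)):
        num = hrss_numeric(1e-21, f0, q)
        ana = hrss_analytic(1e-21, f0, q)
        print(f"f0 = {f0:6.0f} Hz, Q = {q:3.0f}: numeric {num:.4e}, analytic {ana:.4e}")
```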

The second family consists of Gaussian pulses de- 
scribed by the following expression: 

$$ h_+(t_0 + t) = h_0 \exp(-t^2/\tau^2) \qquad (7.4) $$

where $\tau$ is chosen to be one of (0.05, 0.1, 0.25, 0.5, 1.0, 
2.5, 4.0, 6.0, 8.0) ms. 
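
For the Gaussian pulse of Eq. (7.4), with peak amplitude $h_0$ and width $\tau$, the root-sum-square amplitude follows from a single Gaussian integral. This short derivation is ours and assumes linear polarization, so that the cross polarization vanishes:

```latex
h_{\rm rss}^2 = h_0^2 \int_{-\infty}^{+\infty} e^{-2t^2/\tau^2}\, dt
             = h_0^2\, \tau \sqrt{\frac{\pi}{2}}
\quad\Longrightarrow\quad
h_{\rm rss} = h_0 \left(\frac{\pi}{2}\right)^{1/4} \sqrt{\tau}
```

At fixed peak amplitude, a longer pulse therefore carries a larger $h_{\rm rss}$, growing as the square root of $\tau$.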

The third family are the "white noise bursts" (WNBs). 
These were generated by bandpassing white noise in fre- 
quency bands starting at 100 Hz, 250 Hz, or 1000 Hz, with 
bandwidth 10 Hz, 100 Hz, or 1000 Hz, and by time win- 
dowing with Gaussian profiles of duration (half of the 
interval between the inflection points) equal to 100 ms, 
10 ms, or 1 ms. For each waveform type (a choice of cen- 
tral frequency, bandwidth, and duration), 30 waveform 
files with random data content were created. The injec- 
tions for each waveform type use random pairs selected 
from the 30 created waveforms for the $h_+$ and $h_\times$ polarizations 
(the selection avoids pairs with identical waveforms). 
This results in unpolarized injections with equal 
amounts of power on average in each polarization state. 
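
The pair selection described above amounts to drawing two distinct items from a pool of 30. A minimal sketch follows, using indices in place of the waveform files; the function name is hypothetical.

```python
import random


def pick_polarization_pair(n_waveforms=30, rng=random):
    """Return two distinct waveform indices, one for h_plus and one for h_cross."""
    # random.sample draws without replacement, so the two indices always differ.
    plus_index, cross_index = rng.sample(range(n_waveforms), 2)
    return plus_index, cross_index


if __name__ == "__main__":
    rng = random.Random(2005)
    print([pick_polarization_pair(rng=rng) for _ in range(5)])
```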

Each efficiency curve, consisting of the efficiencies de- 
termined for a given signal morphology at each of the 22 
$h_{\rm rss}$ values, was fitted with an empirical four-parameter 
function. The efficiency curves for the logical OR combination 
of the three pipelines and for the combined H1H2 
and H1H2L1 networks are shown for selected waveforms 
in Figs. 3 and 4. The $h_{\rm rss}$ values yielding 50% detection 
efficiency, $h_{\rm rss}^{50\%}$, are shown in Tables I and II for 
sine-Gaussians with Q = 9 and for white noise bursts in- 
jected and analyzed in H1H2L1 data. The study of the 
efficiency for all the waveforms shows that the combina- 
tion of the methods is slightly more sensitive than the 
best performing one, which is QPipeline for some of the 
sine-Gaussians, and cWB for all other waveforms consid- 
ered. 
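
The two operations used here, combining pipelines with a logical OR and reading off the amplitude at 50% efficiency, can be sketched as below. Since the form of the four-parameter fit is not given in this section, the sketch substitutes a log-linear interpolation between sampled points; that substitution and the function names are ours.

```python
import math


def combined_detection(*pipeline_flags):
    """Logical OR across pipelines, injection by injection."""
    return [any(flags) for flags in zip(*pipeline_flags)]


def hrss_at_half_efficiency(amplitudes, efficiencies):
    """Amplitude at which the efficiency first rises through 0.5.

    Uses interpolation that is linear in log(amplitude), as a stand-in for a
    fitted efficiency curve. Returns None if no upward crossing is found.
    """
    points = list(zip(amplitudes, efficiencies))
    for (h_lo, e_lo), (h_hi, e_hi) in zip(points, points[1:]):
        if e_lo < 0.5 <= e_hi:
            frac = (0.5 - e_lo) / (e_hi - e_lo)
            log_h = math.log(h_lo) + frac * (math.log(h_hi) - math.log(h_lo))
            return math.exp(log_h)
    return None


if __name__ == "__main__":
    pipe_a = [True, False, False, True]
    pipe_b = [False, False, True, True]
    print("OR of pipelines:", combined_detection(pipe_a, pipe_b))
    print("h_rss at 50%:", hrss_at_half_efficiency([1e-22, 2e-22, 4e-22], [0.1, 0.4, 0.6]))
```

An injection counts as found if any pipeline finds it, so the combined efficiency at a given amplitude can never be lower than that of the best single pipeline.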



VIII. STATISTICAL AND CALIBRATION 
ERRORS 

The /if£T° values presented in this paper have been ad- 
justed to conservatively reflect systematic and statistical 
uncertainties. The dominant source of systematic uncer- 
tainty is from the amplitude measurements in the fre- 
quency domain calibration. The individual amplitude 
uncertainties from each interferometer can be combined 
into a single uncertainty by calculating a combined root- 
sum-square amplitude SNR and propagating the individ- 
ual uncertainties assuming each error is independent. In 
addition, there is a small uncertainty (about 1%) intro- 
duced by converting from the frequency domain to the 
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25.9 


227.4 


33.1 
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14.0 
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235 
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6.3 
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6.8 


361 
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10.9 


11.2 
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12.0 
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15.5 


12.9 
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18.1 
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23.7 


19.2 


945 


9 


20.6 


21.6 


27.8 


22.2 


1053 


9 


23.3 


24.8 


33.4 


24.1 


1172 
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25.2 


26.8 


36.5 


26.3 


1304 


9 


28.7 


30.9 


40.8 


29.5 


1451 


9 


32.0 


35.0 


48.1 


32.9 


1615 


9 


35.2 


38.2 


51.5 


36.3 


1797 


9 


42.0 


44.2 


62.2 


45.4 


2000 


9 


54.5 


55.9 


77.6 


68.8 



TABLE I: h I3S values yielding 50% detection efficiency, in 
units of 10~ 22 Hz~ 1/2 , for different sine-Gaussian waveforms 
and pipelines in the H1H2L1 network. The first column is 
the central frequency, the second the quality factor, the third 
the h^s of the logical OR of the pipelines, and the remain- 
ing three columns the frrss of the individual pipelines. These 
ft™? values include an adjustment of 11.1% to take into ac- 
count calibration and statistical uncertainties as explained in 

sec. ivm 



/ (Hz) BW (Hz) d (ms) Combined cWB BN QP 
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51.8 
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1000 


1000 


0.01 
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73.0 
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44.6 
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14.1 
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10 
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9.1 
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13.7 
250    100    0.01     7.3    7.6   18.6    8.5
250    100    0.1      8.8    8.9   11.6   13.4
250     10    0.1      5.9    5.9    9.0   17.6



TABLE II: $h_{\rm rss}$ values yielding 50% detection efficiency, in
units of $10^{-22}$ Hz$^{-1/2}$, for different white noise burst
waveforms and pipelines in the H1H2L1 network. The first
column is the central frequency, the second the bandwidth, the
third the duration of the Gaussian window, the fourth the
$h_{\rm rss}^{50\%}$ of the logical OR of the pipelines, and the remaining
three columns the $h_{\rm rss}^{50\%}$ of the individual pipelines. These
$h_{\rm rss}^{50\%}$ values include an adjustment of 11.1% to take into
account calibration and statistical uncertainties as explained in
Sec. VIII.
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FIG. 3: Combined efficiencies of the three pipelines and two 
networks (H1H2L1 and H1H2) used in the upper limit analysis 
for selected sine-Gaussian waveforms with (a) Q = 3, (b) 
Q = 9, (c) Q = 100. These efficiencies have been calculated 
using the logical OR of the pipelines and networks for the 
subset of simulated signals that were injected in time intervals 
that were actually analyzed, and thus approach unity for large 
amplitudes. 



time domain strain series on which the analysis was actu- 
ally run. There is also phase uncertainty on the order of a 
few degrees in each interferometer, arising both from the 
initial frequency domain calibration and the conversion 
to the time domain. However, this is not a significant 
concern since the phase uncertainties at all frequencies 
correspond to phase shifts on the order of less than half 
a sample duration. We therefore do not make any ad- 
justment to the overall systematic uncertainties due to 
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FIG. 4: Combined efficiency of the three pipelines and two 
networks (H1H2L1 and H1H2) used in the upper limit analysis 
for (a) selected linearly-polarized Gaussian waveforms; (b) se- 
lected band-limited white-noise bursts with two independent 
polarization components. These efficiencies have been calcu- 
lated using the logical OR of the pipelines and networks for 
the subset of simulated signals that were injected in time in- 
tervals that were actually analyzed, and thus approach unity 
for large amplitudes. 



and time-shifted (background) data, histograms of the 
two populations are generated for each pipeline, inter- 
ferometer network and frequency band. See for example 
trigger distributions for the H1H2L1 network in Figs. 5,
6 and 7. No unshifted triggers are found above threshold
in the final sample for any of the three pipelines and four 
network configurations. We therefore have no candidate 
GW signals, and no follow up for possible detections is 
performed. We proceed to set upper limits on the rate of 
specific classes of GWs. 
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phase error. Finally, statistical uncertainties on the fit 
parameters (arising from the binomial errors on the efficiency
measurements) affect $h_{\rm rss}^{50\%}$ by approximately 1.4%
on average and are not much different for any particular 
waveform. 

The frequency-domain amplitude uncertainties are
added in quadrature with the other smaller uncertainties
to obtain a total 1-sigma relative error for the SNR.
The relative error in the $h_{\rm rss}$ is then the same as the
relative error in the SNR. Thus, we adjust our sensitivity
estimates by increasing the $h_{\rm rss}^{50\%}$ values by the reported
percent uncertainties multiplied by 1.28 (to rescale from
a 1-sigma fluctuation to a 90% confidence level upper
limit, assuming Gaussian behavior), which amounts to
11.1% in the frequency band explored in this paper.
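
As a rough arithmetic check of this rescaling, the sketch below recovers the 1-sigma total that corresponds to the quoted 11.1% adjustment. The factor 1.28 is the one-sided 90% quantile of a unit Gaussian. The "implied" frequency-domain amplitude term is back-computed from the numbers quoted in the text and is an illustrative assumption, not a calibration result reported here; the helper name `adjusted_hrss` is hypothetical.

```python
import math
from statistics import NormalDist

# One-sided 90% quantile of a unit Gaussian (the factor ~1.28 in the text).
z90 = NormalDist().inv_cdf(0.90)

adjustment_pct = 11.1                      # total adjustment quoted in the text
sigma_total_pct = adjustment_pct / z90     # implied total 1-sigma error

# Smaller contributions quoted in the text (percent).
sigma_time_domain_pct = 1.0   # frequency-domain to time-domain conversion
sigma_fit_pct = 1.4           # statistical error on the efficiency fits

# Assumption: the remaining term is whatever closes the quadrature sum.
implied_amplitude_pct = math.sqrt(
    sigma_total_pct**2 - sigma_time_domain_pct**2 - sigma_fit_pct**2
)

def adjusted_hrss(hrss, sigma_pct=sigma_total_pct):
    """Inflate a measured h_rss value to a 90% confidence upper estimate."""
    return hrss * (1.0 + z90 * sigma_pct / 100.0)

print(f"z90 = {z90:.3f}, total 1-sigma = {sigma_total_pct:.2f}%")
print(f"implied amplitude term = {implied_amplitude_pct:.2f}%")
```

Under these assumptions the total 1-sigma error is a little under 9%, dominated by the frequency-domain amplitude term.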



IX. SEARCH RESULTS 

Once category 2 DQFs have been applied on the trig- 
gers produced from the unshifted (i.e. candidate sample) 



FIG. 5: Distributions of cWB H1H2L1 triggers after category 
2 DQFs were applied. Overlaid histograms for $\eta$ for unshifted
triggers (dots) and mean background estimated from time- 
shifted triggers (stair-step curve). The narrow error bars in- 
dicate the statistical uncertainty of the background estimate, 
while the shaded band indicates the expected root-mean- 
square statistical fluctuations on the number of background 
triggers in each bin. The top panel represents the triggers 
with central frequency below 200 Hz while the bottom panel 
represents the triggers with central frequency above 200 Hz. 



A. Upper limits 

Our measurements consist of the list of triggers de- 
tected by each analysis pipeline (BN, QP, cWB) in each 
network data set (H1H2L1, H1H2, H1L1, H2L1). BN 
analyzed the H1H2L1 data, QP analyzed H1H2L1 and 
H1H2, and cWB analyzed all four data sets. In general, 
the contribution to the upper limit due to a given pipeline 
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FIG. 6: QPipeline triggers after category 2 DQFs were ap- 
plied. Overlaid histograms for H1H2 correlated energy for un- 
shifted H1H2 triggers (dots) and mean background estimated 
from time-shifted triggers (stair-step curve). The narrow er- 
ror bars indicate the statistical uncertainty of the background 
estimate, while the shaded band indicates the expected root- 
mean-square statistical fluctuations on the number of back- 
ground triggers in each bin. The top panel represents the 
triggers with central frequency below 200 Hz while the bot- 
tom panel represents the triggers with central frequency above 
200 Hz. 



FIG. 7: BlockNormal triggers after category 2 DQFs were applied.
Overlaid histograms for $\Gamma$ for unshifted H1H2L1 triggers
(dots) and mean background estimated from time-shifted
triggers (stair-step curve). The narrow error bars indicate the 
statistical uncertainty of the background estimate, while the 
shaded band indicates the expected root-mean-square statis- 
tical fluctuations on the number of background triggers in 
each bin. The top panel represents the triggers with central 
frequency below 200 Hz while the bottom panel represents the 
triggers with central frequency above 200 Hz. 



and data set increases with both the detection efficiency 
of the pipeline and the livetime of the data set. Since
the duty cycle of the H1L1 and H2L1 data sets is small 
(2.4% and 4.5% after category 3 DQFs and category 3 
vetoes, vs. 37.2% and 22.5% in H1H2L1 and H1H2), and 
the data quality not as good, we decided a priori to not 
include these data sets in the upper limit calculation. We 
are therefore left with five analysis pipeline results:
BN-H1H2L1, QP-H1H2L1, QP-H1H2, cWB-H1H2L1, and
cWB-H1H2. We wish to combine these 5 results to pro- 
duce a single upper limit on the rate of GW bursts of 
each of the morphologies tested. 

We use the approach described in [31] to combine the
results of the different search detection algorithms and 
networks. Here we give only a brief summary of the tech- 
nique. 

The procedure given in [31] is to combine the sets of
triggers according to which pipeline(s) and/or network
detected any given trigger. For example, in the case of
two pipelines "A" and "B", the outcome of the counting
experiment is the set of three numbers $\vec{n} = (n_A, n_B, n_{AB})$,
where $n_A$ is the number of events detected by pipeline A
but not by B, $n_B$ is the number detected by B but not
by A, and $n_{AB}$ is the number detected by both. (The
extension to an arbitrary number of pipelines and data
sets is straightforward.) Similarly, one characterizes the
sensitivity of the experiment by the probability that any
given GW burst will be detected by a given combination
of pipelines. We therefore compute the efficiencies
$\vec{\epsilon} = (\epsilon_A, \epsilon_B, \epsilon_{AB})$, where $\epsilon_A$ is the fraction of GW injections
that are detected by pipeline A but not by B, etc.

To set an upper limit, one must decide a priori how to
rank all possible observations, so as to determine whether
a given observation $\vec{n}$ contains "more" or "fewer" events
than some other observation $\vec{n}'$. Denote the ranking
function by $\zeta(\vec{n})$. Once this choice is made, the actual set of
unshifted events is observed, giving $\vec{n}$, and the rate upper
limit $R_\alpha$ at confidence level $\alpha$ is given by

$$1 - \alpha = \sum_{\vec{N} \,|\, \zeta(\vec{N}) \le \zeta(\vec{n})} P(\vec{N} \,|\, \vec{\epsilon}, R_\alpha \vec{T}). \tag{9.1}$$

Here $P(\vec{N} \,|\, \vec{\epsilon}, R_\alpha \vec{T})$ is the prior probability of observing
$\vec{N}$ given the true GW rate $R_\alpha$, the vector containing the
livetimes of different data sets $\vec{T}$ (this is a scalar if we are
combining results of methods analyzing the same livetime),
and the detection efficiencies $\vec{\epsilon}$. The sum is taken
over all $\vec{N}$ for which $\zeta(\vec{N}) \le \zeta(\vec{n})$; i.e., over all possible
outcomes $\vec{N}$ that result in "as few or fewer" events than
were actually observed.

As shown in [31], a convenient choice for the rank
ordering is

$$\zeta(\vec{n}) = \vec{\epsilon} \cdot \vec{n}. \tag{9.2}$$

That is, we weight the individual measurements
$(n_A, n_B, n_{AB}, \ldots)$ proportionally to the corresponding
efficiency $(\epsilon_A, \epsilon_B, \epsilon_{AB}, \ldots)$. This simple procedure yields a
single upper limit from the multiple measurements. From
the practical point of view, it has the useful properties
that the pipelines need not be independent, and that
combinations of pipelines and data sets in which it is less
likely for a signal to appear (relatively low $\epsilon_i$) are
naturally given less weight.
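
The efficiency-weighted counting procedure can be sketched numerically for the two-pipeline example. The sketch below assumes that the counts in each exclusive category (A only, B only, both) are independent Poisson variables with means $\epsilon_i R T$ and that a single livetime $T$ is shared by all categories; the efficiencies are hypothetical placeholders, and the function names are ours rather than those of [31].

```python
import math
from itertools import product

def poisson_pmf(k, mu):
    """Probability of k counts for a Poisson variable with mean mu > 0."""
    return math.exp(-mu + k * math.log(mu) - math.lgamma(k + 1))

def prob_as_few_or_fewer(n_obs, eff, rate_T):
    """Sum of P(N | eff, R*T) over all N ranked at or below n_obs."""
    zeta_obs = sum(e * n for e, n in zip(eff, n_obs))
    # A count N_i can be at most zeta_obs / eff_i for N to qualify.
    ranges = [range(int(zeta_obs / e + 1e-9) + 1) for e in eff]
    total = 0.0
    for N in product(*ranges):
        if sum(e * k for e, k in zip(eff, N)) <= zeta_obs + 1e-12:
            p = 1.0
            for e, k in zip(eff, N):
                p *= poisson_pmf(k, e * rate_T)
            total += p
    return total

def rate_upper_limit(n_obs, eff, alpha=0.90, hi=1000.0):
    """Solve for R*T at confidence level alpha by bisection."""
    lo = 0.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if prob_as_few_or_fewer(n_obs, eff, mid) > 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

eff = (0.2, 0.1, 0.5)                       # hypothetical (A only, B only, both)
ul_zero = rate_upper_limit((0, 0, 0), eff)  # no events observed
ul_one = rate_upper_limit((1, 0, 0), eff)   # one event seen by A only
print(ul_zero, ul_one)
```

With zero observed events only the outcome $\vec{N} = 0$ contributes, so the limit reduces to $\ln(10)$ divided by the summed efficiency, which is the efficiency of the logical OR of the pipelines.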

Note that for the purpose of computing the upper limit
on the GW rate, we are ignoring any background. This leads
to our limits being somewhat conservative, since a non-zero
background contribution to $\vec{n}$ will tend to increase
the estimated limit.

In the present search, no events were detected by any
analysis pipeline, so $\vec{n} = 0$. As shown in [31], in this
case the efficiency-weighted upper limit procedure given
by Eqs. (9.1) and (9.2) gives a particularly simple result:
the procedure is equivalent to taking the logical OR of all
five pipeline/network samples. The $\alpha = 90\%$ confidence
level upper limit for zero observed events, $R_{90\%}$, is given
by

$$0.1 = \exp(-\epsilon_{\rm tot} R_{90\%} T) \tag{9.3}$$

$$\Rightarrow R_{90\%} = \frac{2.30}{\epsilon_{\rm tot} T} \tag{9.4}$$

where $\epsilon_{\rm tot}$ is the weighted average of all the efficiencies
(the weight is the relative livetime) and $T$ is the total
observation time. Fig. 8 shows the combined rate upper
limits as a function of amplitude for selected sine-Gaussian
and Gaussian GW bursts. In the limit of strong
signals, $\epsilon_{\rm tot} T$ goes to 224.0 days, which is the union of
all time analyzed for the H1H2L1 and H1H2 networks
after category 3 DQFs. The rate limit thus becomes
0.0103 day$^{-1}$ = 3.75 yr$^{-1}$.
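
The quoted rate limit follows directly from the zero-event formula. The short check below reproduces it, taking the effective livetime of 224.0 days from the text and assuming 365.25 days per year for the unit conversion; the function name is hypothetical.

```python
import math

def rate_limit_zero_events(eff_times_T, confidence=0.90):
    """Upper limit on the rate when no events are observed:
    R = -ln(1 - confidence) / (efficiency * livetime)."""
    return -math.log(1.0 - confidence) / eff_times_T

rate_per_day = rate_limit_zero_events(224.0)   # strong-signal limit, in 1/day
rate_per_year = rate_per_day * 365.25          # assumed days per year

print(f"{rate_per_day:.4f} per day = {rate_per_year:.2f} per year")
```

For weaker signals the same function applies with a smaller product of efficiency and livetime, which is what raises the exclusion curves at low amplitude.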



X. SUMMARY AND DISCUSSION 

The search for unmodeled GW bursts reported in this 
paper is currently the most sensitive ever performed. The 
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FIG. 8: Selected exclusion diagrams showing the 90% con- 
fidence rate limit as a function of signal amplitude for Q=9 
sine-Gaussian (top) and Gaussian (bottom) waveforms for the 
results in this paper (S5) compared to the results reported 
previously (SI, S2, and S4). 



quality of the data and the sensitivity of the data analysis
algorithms have improved since the S4 run, and the
quantity of data available for analysis has increased by
more than an order of magnitude. These improvements
are reflected in the greater strain sensitivity (with $h_{\rm rss}^{50\%}$
values as low as $\sim 6 \times 10^{-22}$ Hz$^{-1/2}$) and the tighter
limit on the rate of bursts (less than 3.75 events per
year at 90% confidence level) with large enough amplitudes
to be detected reliably. The most sensitive previous
search, using LIGO S4 data, achieved $h_{\rm rss}^{50\%}$ sensitivities
as low as a few times $10^{-21}$ Hz$^{-1/2}$ and a rate limit of
55 events per year. We note that the IGEC network
of resonant bar detectors has set a more stringent rate
limit, 1.5 events per year at 95% confidence level [32],
for GW bursts near the resonant frequencies of the bars
with $h_{\rm rss} > 8 \times 10^{-19}$ Hz$^{-1/2}$ (see Sec. X of [14] for
the details of this comparison). A later joint observation
run, IGEC-2, was a factor of $\approx 3$ more sensitive but had
shorter observation time [33].

In order to set an astrophysical scale to the sensitiv- 
ity achieved by this search, we now repeat the analysis 
and the examples presented for S4. Specifically, we can 
estimate what amount of mass converted into GW burst 
energy at a given distance would be strong enough to 
be detected by the search with 50% efficiency. Following
the same steps as in [17], assuming isotropic emission
and a distance of 10 kpc, we find that a 153 Hz sine-Gaussian
with $Q = 9$ would need $1.9 \times 10^{-8}$ solar masses,
while for S4 the figure was $10^{-7}\,M_\odot$. For a source in the
Virgo galaxy cluster, approximately 16 Mpc away, the
same $h_{\rm rss}$ would be produced by an energy emission of
roughly $0.05\,M_\odot c^2$, while for S4 it was $0.25\,M_\odot c^2$.
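
These figures can be cross-checked with the standard relation for isotropic emission of a narrow-band burst, $E_{\rm GW} = \pi^2 c^3 r^2 f_0^2 h_{\rm rss}^2 / G$. We assume here that this is the relation behind the steps in [17]; the physical constants are rounded and the function names are hypothetical.

```python
import math

G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8          # speed of light, m/s
M_SUN = 1.989e30     # solar mass, kg
KPC = 3.0857e19      # kiloparsec, m

def hrss_from_energy(energy_msun_c2, distance_m, f0_hz):
    """h_rss at the detector for an isotropic burst of given energy."""
    energy_joule = energy_msun_c2 * M_SUN * C**2
    return math.sqrt(G * energy_joule /
                     (math.pi**2 * C**3 * distance_m**2 * f0_hz**2))

def energy_from_hrss(hrss, distance_m, f0_hz):
    """Emitted energy in units of solar masses times c^2."""
    energy_joule = math.pi**2 * C**3 * distance_m**2 * f0_hz**2 * hrss**2 / G
    return energy_joule / (M_SUN * C**2)

# 1.9e-8 solar masses at 10 kpc, 153 Hz sine-Gaussian (values from the text).
h = hrss_from_energy(1.9e-8, 10 * KPC, 153.0)
# Same h_rss from a source in the Virgo cluster at 16 Mpc.
e_virgo = energy_from_hrss(h, 16e3 * KPC, 153.0)

print(f"h_rss = {h:.2e} Hz^-1/2, Virgo energy = {e_virgo:.3f} Msun c^2")
```

Because the energy scales as $r^2$ at fixed $h_{\rm rss}$, moving the source from 10 kpc to 16 Mpc multiplies the required energy by $1600^2$.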

We can also update our estimates for the detectabil- 
ity of two classes of astrophysical sources: core col- 
lapse supernovae and binary black-hole mergers. We 
consider first the core collapse supernova simulations by 
Ott et al. [3]. In that paper gravitational waveforms
were computed for three progenitor models: s11WW,
m15b6 and s25WW. From S4 to S5 the astrophysical
reach for the s11WW and m15b6 models improved from
approximately 0.2 to 0.6 kpc, while for s25WW it
improved from 8 to 24 kpc. Second, we consider the binary
black hole merger calculated by the Goddard numerical
relativity group [7]. A binary system of two
10-solar-mass black holes (total $20\,M_\odot$) would be detectable
with 50% efficiency at a distance of roughly 4 Mpc, compared
to 1.4 Mpc in S4, while a system with total mass
$100\,M_\odot$ would be detectable out to $\sim 180$ Mpc, compared
to $\sim 60$ Mpc in S4. In each case the astrophysical reach
has improved by approximately a factor of 3 from S4 to
S5.

At present, the analysis of the second year of S5 is well 
underway, including a joint analysis of data from Virgo's 
VSR1 run which overlaps with the final 4.5 months of S5. 
Along with the potential for better sky coverage, posi- 
tion reconstruction and glitch rejection, the joint analysis 
brings with it new challenges and opportunities. Look- 
ing further ahead, the sixth LIGO science run and sec- 
ond Virgo science run are scheduled to start in mid 2009, 
with the two LIGO 4 km interferometers operating in an 
"enhanced" configuration that is aimed at delivering ap- 
proximately a factor of two improvement in sensitivity, 
and comparable improvements for Virgo. Thus we will 
soon be able to search for GW bursts farther out into the 
universe. 
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APPENDIX A: DATA QUALITY FLAGS 

Data quality flags are defined by the LIGO Detector 
Characterization group by carefully processing informa- 
tion on the behavior of the instrument prior to analyzing 
unshifted triggers. Some are defined online, as the data 
are acquired, while others are formulated offline. A wide 
range of DQFs have been defined. The relevance of each 
available DQF has been evaluated and classified into cat- 
egories which are used differently in the analysis, which 
we now describe. 

Category 1 DQFs are used to define the data set pro- 
cessed by the search algorithms. They include out-of- 
science mode, the 30 seconds before loss of lock, periods 
when the data are corrupted and periods when test sig- 
nals are injected into the detector. They also include 
short transients that are loud enough to significantly dis- 
tort the detector response and could affect the power 
spectral density used for normalization by the search al- 
gorithm, such as dropouts in the calibration and photo- 
diode saturations. 

Category 2 flags are unconditional post-processing 
data cuts, used to define the "full" data set used to look 
for detection candidates. The flags are associated with 
unambiguous malfunctioning with a proven correlation 
with loud transients in the GW channel, where we under- 
stand the physical coupling mechanism. They typically 
only introduce a fraction of a percent of deadtime over 
the run. Examples include saturations in the alignment 
control system, glitches in the power mains, time-domain 
calibration anomalies, and large glitches in the thermal 
compensation system. 

Category 3 DQFs are applied to define the "clean" data 
set, used to set an upper limit in the absence of a detec- 
tion candidate. Any detection candidate found at a time 
marked with a category 3 DQF would not be immediately 
rejected but would be considered cautiously, with special 
attention to the effect of the flagged condition on detec- 
tion confidence. DQF correlations with transients in the 
GW channels are established at the single interferome- 
ter level. Examples include the 120 s prior to lock-loss, 
noise in power mains, transient drops in the intensity of 
the light stored in the arm cavities, times when one Han- 
ford instrument is unlocked and may negatively affect 
the other instrument, times with particularly poor sensi- 
tivity, and times associated with severe seismic activity, 
high wind speed, or hurricanes. These flags introduce up 
to ~10% dead time. 

Category 4 flags are advisory only: We have no clear 
evidence of a correlation to loud transients in the GW 
channel, but if we find a detection candidate at these 
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FIG. 9: The two examples in the figure show the fraction of 
single interferometer (L1) KleineWelle triggers eliminated by
category 2 (top) and category 3 (bottom) DQFs, as a function
of a threshold on the significance. The cumulative impact on
the livetime is less than 7 percent (mostly from category 3
DQFs), and the cuts are most effective for the loudest triggers.
For example, a significance of 1000 means that if the detector
noise were Gaussian, the noise would have a probability $e^{-1000}$
of fluctuating to produce such a loud trigger.

times, we need to exert caution. Examples are certain 
data validation issues and various local events marked in 
the electronic logs by operators and science monitors. 

Figure 9 shows the fraction of KleineWelle triggers that
are eliminated by category 2 and 3 DQFs, respectively,
in the L1 interferometer, as a function of the significance
of the energy excess identified by the trigger, which is 
evaluated assuming stationary, random noise. To ensure 
DQFs are independent of the presence of a true GW, we 
verified they are not triggered by hardware injections. 



APPENDIX B: EVENT-BY-EVENT VETOES 

Event-by-event vetoes attempt to discard GW channel 
noise events by using information from the many envi- 
ronmental and interferometric auxiliary channels which 
measure non-GW degrees of freedom. Good vetoes are 
found by looking for situations in which a short (~ms) 
noise transient in an auxiliary channel, identified by the 
KleineWelle (KW) algorithm, often coincides within a 
short interval (~100 ms) with noise transients in the GW 
channel. The work, then, is in identifying useful auxiliary 
channels which are well correlated with noise transients
in the GW data, choosing the relevant veto parameters 
to use, and finally establishing that the veto procedure 
will not systematically throw out true GWs. As for the 



data quality flags, vetoes are defined prior to generat- 
ing triggers from unshifted data. The trigger properties 
used for veto studies are the KW signal energy-weighted 
central time and the KW statistical significance. The 
correlation between noise events in the GW channel and
an auxiliary channel is determined by comparing the
coincidence rate measured with no time shift applied to the
coincidence rate obtained when one of the time series has
been artificially time-shifted with respect to the other.
Alternatively, we can compare the number of coincidences
with the number expected by chance, assuming Poisson statistics.

As for the DQFs, category 2 vetoes are defined us- 
ing only a few subsets of related channels, showing the 
more obvious kinds of mechanisms for disturbing the in- 
terferometers - either vibrational or magnetic coupling. 
Furthermore, for this S5 analysis we insist that multiple 
(3 or more) channels from each subset be excited in coin- 
cidence before declaring a category 2 veto, to ensure that 
a genuine disturbance is being measured in each case. By 
contrast, the category 3 vetoes use a substantially larger 
list of channels. The aim of this latter category of veto 
is to produce the optimum reduction of false events for a 
chosen tolerable amount of livetime loss. 



a. Veto effectiveness metrics 

Veto efficiency is defined for a given set of triggers as 
the fraction vetoed by our method. We use a simple 
veto logic where an event is vetoed if its peak time falls 
within a veto window, and define the veto dead-time frac- 
tion to be the fraction of livetime flagged by all the veto 
windows. Assuming that real events are randomly dis- 
tributed in time, dead-time fraction represents the proba- 
bility of vetoing a true GW event by chance. We will refer 
to the flagged dead-time as the veto segments. A veto ef- 
ficiency greater than the dead-time fraction indicates a 
correlation between the triggers and veto segments. 

Under either the assumption of randomly distributed 
triggers, or randomly distributed dead-time, the number 
of events that fall within the flagged dead-time is Pois- 
son distributed with mean value equal to the number of 
events times the fractional dead-time, or equivalently, the 
event rate times the duration of veto segments. We define
the statistical significance of actually observing $N$
vetoed events as $S(N) = -\log[P_{\rm Poiss}(x \ge N)]$.
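
A minimal sketch of these veto metrics is given below. It assumes the natural logarithm in the definition of the significance (the base is not specified above) and sums the upper Poisson tail directly so that very small probabilities are not lost to rounding; the function names and example numbers are hypothetical.

```python
import math

def poisson_tail(n, mu):
    """P(x >= n) for a Poisson variable with mean mu > 0."""
    if n <= 0:
        return 1.0
    # Start from pmf(n), computed in log space, and sum upward.
    term = math.exp(-mu + n * math.log(mu) - math.lgamma(n + 1))
    total = 0.0
    for k in range(n, n + 1000):
        total += term
        term *= mu / (k + 1)          # pmf(k+1) from pmf(k)
        if k + 1 > mu and term <= 1e-17 * total:
            break
    return total

def veto_stats(n_events, n_vetoed, dead_time_fraction):
    """Return (efficiency, efficiency/dead-time ratio, significance)."""
    efficiency = n_vetoed / n_events
    expected = n_events * dead_time_fraction   # chance coincidences
    significance = -math.log(poisson_tail(n_vetoed, expected))
    return efficiency, efficiency / dead_time_fraction, significance

# Hypothetical example: 1000 triggers, 1% dead-time, 50 triggers vetoed.
efficiency, ratio, significance = veto_stats(1000, 50, 0.01)
print(efficiency, ratio, significance)
```

In this example the veto removes five times more triggers than chance would predict, and the large significance indicates that the excess is very unlikely to be a random fluctuation.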

We must also consider the safety of a veto condition: 
auxiliary channels (besides the GW channel) could in 
principle be affected by a GW, and a veto condition de- 
rived from such a channel could systematically reject a 
genuine signal. Hardware signal injections imitating the 
passage of GWs through our detectors, performed at sev- 
eral pre-determined times during the run, have been used 
to establish under what conditions each channel is safe 
to use as a veto. Non-detection of a hardware injection 
by an auxiliary channel suggests the unconditional safety 
of this channel as a veto in the search, assuming that a 
reasonably broad selection of signal strengths and fre- 
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quencies were injected. But even if hardware injections 
are seen in the auxiliary channels, conditions can readily 
be derived under which no triggers caused by the hard- 
ware injections are used as vetoes. This involves imposing 
conditions on the strength of the triggers and/or on the 
ratio of the signal strength seen in the auxiliary channel 
to that seen in the GW channel. 

Veto safety was quantified in terms of the probability 
of observing >N coincidence events between the auxil- 
iary channel and hardware injections vs. the number of 
coincidences expected from time-shifts. 

The observed coincident rate is a random variable
itself that fluctuates around the true coincident rate. 
In the veto analysis we use the 90% confidence upper 
limit on the background coincidence rate which can 
be derived from the observed coincidence rate. This 
procedure makes it easier to consider a veto safe than 
unsafe and the reason for this approach was to lean 
toward vetoing questionable events. A total of 20 
time-shifts were performed. The analysis looped over 
7 different auxiliary channel thresholds and calculated 
this probability, and a probability of less than 10% 
caused a veto channel at and below the given threshold 
to be judged unsafe. A fixed 100 ms window between 
the peak time of the injection and the peak time of 
the KleineWelle trigger in the auxiliary channel was used.

All channels used for category 2 vetoes were found to 
be safe at any threshold. Thresholds for category 3 veto 
channels were chosen so as to ensure that the channel 
was safe at that threshold and above. 
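
One way to picture the safety test is sketched below. The exact statistical recipe is not spelled out above, so this is an assumed reading: the background count summed over the 20 time-shifts is converted to a 90% confidence upper limit on its Poisson mean, that limit is divided by the number of shifts to give the expected chance coincidences, and the channel is flagged unsafe when the probability of seeing at least the observed number of coincidences with hardware injections falls below 10%. All function names are hypothetical.

```python
import math

def poisson_cdf(n, mu):
    """P(x <= n) for a Poisson variable with mean mu."""
    term = math.exp(-mu)
    total = term
    for k in range(1, n + 1):
        term *= mu / k
        total += term
    return total

def upper_limit_on_mean(n_obs, confidence=0.90):
    """Poisson mean for which P(x <= n_obs) equals 1 - confidence."""
    lo, hi = 0.0, n_obs + 50.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if poisson_cdf(n_obs, mid) > 1.0 - confidence:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def is_safe(n_coincident, n_background_total, n_shifts=20, p_min=0.10):
    """Return (safe?, probability of >= n_coincident chance coincidences)."""
    expected = upper_limit_on_mean(n_background_total) / n_shifts
    if n_coincident == 0:
        return True, 1.0
    p = 1.0 - poisson_cdf(n_coincident - 1, expected)
    return p >= p_min, p

print(is_safe(0, 4))   # no injections seen in the auxiliary channel
print(is_safe(3, 4))   # three injections seen: unlikely to be chance
```

Using the upper limit on the background inflates the chance expectation, which is what makes it easier to judge a channel safe than unsafe.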



b. Selection of veto conditions 

For the purpose of defining conservative vetoes ap- 
propriate for applying as category 2 (before looking for 
GW detections), we studied environmental channels. We 
found that these fall into groups of channels that each 
veto a large number of the same events. Based on this 
observation, three classes of environmental channels were 
adopted as vetoes. For LHO these classes were 24 mag- 
netometers and voltmeters with a KW threshold of 200 
and time window of 100 ms, and 32 accelerometers and 
seismometers with a threshold on the KW significance of 
100 and a time window of 200 ms. For LLO these were 
12 magnetometers and voltmeters with a KW threshold 
of 200 and a time window of 100 ms. We used all of the 
channels that should have been sensitive to similar effects 
across a site, with the exception that channels known to 
have been malfunctioning during the time period were 
removed from the list. 

To ensure that our vetoes are based on true environ- 
mental disturbances, a further step of voting was imple- 
mented. An event must be vetoed by three or more 
channels in a particular veto group in order to be dis- 
carded from the detection search. These conditions re- 
move ~0.1% from the S5 livetime. 
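
The voting step can be sketched as a simple coincidence count. The sketch assumes that each channel's triggers have already been thresholded on KW significance, and that the quoted time window is applied symmetrically around the event time (this is our assumption); the channel names and times are hypothetical.

```python
from bisect import bisect_left, bisect_right

def channels_in_coincidence(event_time, channel_triggers, window):
    """Count channels with at least one trigger within +/- window seconds.

    channel_triggers maps a channel name to a sorted list of trigger times.
    """
    count = 0
    for times in channel_triggers.values():
        first = bisect_left(times, event_time - window)
        last = bisect_right(times, event_time + window)
        if last > first:
            count += 1
    return count

def is_vetoed(event_time, channel_triggers, window=0.1, min_channels=3):
    """Veto only if enough channels of the group fire in coincidence."""
    return channels_in_coincidence(event_time, channel_triggers, window) >= min_channels

# Hypothetical magnetometer/voltmeter group with above-threshold triggers.
group = {
    "MAG_X": [99.95],
    "MAG_Y": [100.02, 250.0],
    "MAG_Z": [100.08],
    "VOLT_1": [300.0],
}
print(is_vetoed(100.0, group))   # three channels agree
print(is_vetoed(250.0, group))   # only one channel fires
```

Requiring agreement between several channels protects against a single noisy or malfunctioning sensor vetoing GW-channel events on its own.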



In the more aggressive category 3 vetoes, used for 
cleaning up the data for an upper limit analysis, we draw 
from a large number of channels (about 60 interferometric
channels per instrument, and 100 environmental channels
per site). This task is complicated by the desire to
choose optimal veto thresholds and windows, and the fact 
that the veto channels themselves can be highly corre- 
lated with each other so that applying one veto channel 
changes the incremental cost (in additional dead-time) 
and benefit (in additional veto efficiency) of applying an- 
other. Applying all vetoes which perform well by them- 
selves often leads to an inefficient use of dead-time as 
dead-time continues to accumulate while the same noise 
events are vetoed over and over. 

For a particular set of GW channel noise events, we 
adopt a "hierarchical" approach to choose the best sub- 
set of all possible veto conditions to use for a target dead- 
time. This amounts to finding an ordering of veto condi- 
tions (veto channel, threshold, and window) from best to 
worst such that the desired set of veto conditions can be 
made by accumulating from the top veto conditions so 
long as the dead-time does not exceed our limit, which is 
typically a few percent. 

We begin with an approximately ordered list based on 
the performance of each veto condition (channel, window, 
and threshold) considered separately. Incremental veto 
statistics are calculated for the entire list of conditions 
using the available ordering. This means that for a given 
veto condition, statistics are no longer calculated over the 
entire S5 livetime, but only over the fraction of livetime 
that remains after all veto conditions earlier in the list 
have been applied. The list is then re-sorted according 
to the incremental performance metric and the process is 
repeated until further iterations yield a negligible change 
in ordering. 

The ratio of incremental veto efficiency to incremental 
dead-time is used as a performance metric to sort veto 
conditions. This ratio gives the factor by which the rate 
of noise events inside the veto segments exceeds the aver- 
age rate. By adopting veto conditions with the largest in- 
cremental efficiency/dead-time ratio, we maximize total 
efficiency for a target dead-time. We also set a threshold 
of probability P < 0.001 on veto significance (not to be 
confused with the significance of the triggers themselves).
This is particularly important for low-number statistics 
when large efficiency/dead-time ratios can occasionally 
result from a perfectly random process. 
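
The hierarchical ordering can be illustrated with a toy model in which each veto condition is represented by the set of noise events it vetoes and the set of one-second intervals it flags. This representation, the function names and the example data are hypothetical, and the significance cut on each veto is omitted for brevity.

```python
def ratio(n_new_events, n_events_left, n_new_dead, live_left):
    """Efficiency / dead-time ratio; zero if nothing new is gained."""
    if n_new_dead == 0 or n_events_left == 0:
        return 0.0
    return (n_new_events / n_events_left) / (n_new_dead / live_left)

def incremental_ratios(order, conditions, n_events, livetime):
    """Ratios computed only over events and livetime left by earlier vetoes."""
    seen_events, seen_dead, out = set(), set(), {}
    for name in order:
        events, dead = conditions[name]
        out[name] = ratio(len(events - seen_events), n_events - len(seen_events),
                          len(dead - seen_dead), livetime - len(seen_dead))
        seen_events |= events
        seen_dead |= dead
    return out

def hierarchical_order(conditions, n_events, livetime, max_iter=20):
    """Start from standalone performance, then re-sort until stable."""
    order = sorted(conditions, reverse=True, key=lambda c: ratio(
        len(conditions[c][0]), n_events, len(conditions[c][1]), livetime))
    for _ in range(max_iter):
        stats = incremental_ratios(order, conditions, n_events, livetime)
        new_order = sorted(order, key=lambda c: stats[c], reverse=True)
        if new_order == order:
            break
        order = new_order
    return order

def select(order, conditions, livetime, max_dead_fraction):
    """Accumulate conditions from the top until the dead-time budget is hit."""
    chosen, dead = [], set()
    for name in order:
        trial = dead | conditions[name][1]
        if len(trial) / livetime > max_dead_fraction:
            break
        chosen.append(name)
        dead = trial
    return chosen

conditions = {   # name: (vetoed event ids, flagged seconds)
    "A": (set(range(0, 20)), set(range(0, 10))),
    "B": (set(range(0, 18)), set(range(5, 15))),    # mostly redundant with A
    "C": (set(range(30, 40)), set(range(100, 110))),
}
order = hierarchical_order(conditions, n_events=100, livetime=1000)
chosen = select(order, conditions, livetime=1000, max_dead_fraction=0.02)
print(order, chosen)
```

Condition B looks strong when considered alone, but it vetoes only events already removed by A, so it drops to the bottom of the list once incremental statistics are used.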

Vetoes were optimized over several different sets of GW 
channel noise events including low-threshold H1H2L1 co- 
herent WaveBurst time-shifted events, H1H2 coherent 
WaveBurst playground events, as well as QPipeline and 
KleineWelle single-interferometer triggers. For example,
the effect of data quality flags and event-by-event vetoes
on the sample of coherent WaveBurst time-shifted events
is shown in Fig. 10. Our final list of veto segments to
exclude from the S5 analysis is generated from the union 
of these individually-tuned lists. 
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FIG. 10: Top: Accumulated veto efficiency versus dead-time 
as vetoes are applied cumulatively down the veto list. The 
best vetoes are applied first, so we see a general decrease in 
the effectiveness of vetoes at higher dead-time. Vetoes from 
environmental channels are artificially prioritized over inter- 
ferometric channels, giving rise to the knee in the plot around 
0.8% deadtime where the environmental vetoes are exhausted. 
Bottom: Histogram of coherent network amplitude, $\eta$, for
coherent WaveBurst time-shifted (background) events repre- 
senting 100 S5 livetimes. The different shades show events 
removed by data quality cuts and vetoes at various stages in 
the analysis. 



APPENDIX C: THE BLOCKNORMAL BURST 
SEARCH ALGORITHM 

1. Overview 

The BlockNormal analysis pipeline follows a similar 
logic to the S4 burst analysis [17] by looking for bursts 
that are both coincident and correlated. The BlockNor- 
mal pipeline uses a change-point analysis to identify co- 
incident transient events of high significance in each de- 
tector's data. The subsequent waveform correlation test 
is the same as that used in the S4 analysis. 

A unique feature of the BlockNormal analysis is that it 
can be run on uncalibrated time series data — neither the 



TABLE III: Frequency Bands for BlockNormal Analysis 



change point analysis nor the correlation test are sensitive 
to the overall normalization of the data. 



2. Data conditioning 

The BlockNormal search operated on the frequency 
range 80 to 2048 Hz. To avoid potential issues with 
the additional processing and filtering used to create cal- 
ibrated data, and to be immune to corrections in the 
calibration procedure, the analysis was run on the uncal- 
ibrated GW channel from the LIGO interferometers. 

The data conditioning began with notch filters to sup- 
press out-of-band (below 80 Hz or above 2048 Hz) spec- 
tral features such as low-lying calibration lines, the strong 
60 Hz power-line feature and violin mode harmonics just 
above 2048 Hz. The time-series data were then down- 
sampled to 4096 Hz to suppress high-frequency noise. 
The power-line harmonics in each band were removed 
using Kalman filters [34, 35]. The large amount of power
at low frequencies in the uncalibrated GW channel was 
suppressed with a highpass filter designed with the Parks- 
McClellan algorithm. 

Because the BlockNormal method is purely a time- 
domain statistic, the interferometer data must be divided 
into frequency bands to achieve a degree of frequency 
resolution on the bursts. For this analysis, 12 frequency 
bands approximately 150 Hz in bandwidth spanned the 



range from 80 Hz to 2048 Hz (see Table III). There are
gaps between some bands to avoid the significant non- 
stationary noise from the violin modes of the mirror sus- 
pension wires. 

The division into the twelve frequency bands was done 
using a basebanding procedure. Any calibration lines 
within the band were removed by low-order regression 
filtering against the calibration line injection channel 
data. A final whitening filter of modest order was applied
in each band to satisfy the BlockNormal statistic's
assumption of Gaussianity in the background noise. The 
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data conditioning procedures also had to minimize mix- 
ing noise characteristics between different time periods 
for the change-point analysis, and thus could not rely on 
predictive filtering. 



3. Change-point analysis 

The BlockNormal algorithm uses a Bayesian statistic
termed $\rho_2$ to perform a change-point analysis using the
noise characteristics of time-series data. For an inter- 
val of N time-series samples x[k], this statistic measures 
the statistical likelihood (at each sample k within that 
interval) that the data prior to that point are more con- 
sistent with a different Gaussian-distributed (or normal) 
noise source than are the data following that point. It is 
defined as

The quantity $K_\rho$ is a constant proportional to $\beta R/f_s$,
where $\beta$ is the prior probability, $R$ the desired rate of
blocks, and $f_s$ the sample rate. In fact each interval
is searched for all change-points where $\rho_{2,k}$ exceeds a
threshold value $\rho_E$, where $\rho_E$ is implemented as a number
times $K_\rho$. The sub-intervals between change-points
are termed "blocks". The statistical significance of each
such block is based on its "excess power" $\xi^*$, defined as

\[
\xi^* = N \times \frac{\mu^2 + \nu}{\mu_0^2 + \nu_0} \tag{C5}
\]

where the block has mean $\mu$ and variance $\nu$ against a
background of mean $\mu_0$ and variance $\nu_0$. Events were
selected by requiring the negative-log-likelihood of $\xi^*$
(termed $\Lambda_E$) to exceed a threshold. Here

\[
\Lambda_E = -\ln\left(\Pr[\xi > \xi^*]\right) \tag{C6}
\]

where

\[
\Pr[\xi > \xi^*] = \gamma(N/2,\, \xi^*/2)/\Gamma(N/2). \tag{C7}
\]

The variance-weighted time centroid, $\tau^{(2)}$, of each
event of $n$ samples of amplitude $x_i$ and time $t_i$ was
calculated:

\[
\tau^{(2)} = \frac{\sum_{i=1}^{n} t_i (x_i - \mu)^2}{\sum_{i=1}^{n} (x_i - \mu)^2}
           = \frac{\sum_{i=1}^{n} t_i (x_i - \mu)^2}{(n-1)\,\nu} \tag{C8}
\]

The calibrated band-limited strain energy of each event
was estimated using the frequency-averaged response
$R(f)$ over that band:

\[
E_f = R(f)\left(\mu^2 n + \nu (n-1)\right) \tag{C9}
\]
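As an illustration of Eqs. (C5), (C8) and (C9), the following Python sketch computes the excess power, the variance-weighted time centroid and the band-limited energy of a single block. It is only a sketch under stated assumptions: N in Eq. (C5) is taken to be the block length, the block variance is taken to be the sample variance with an n - 1 denominator, and the function and argument names are hypothetical rather than part of the BlockNormal code.

```python
def block_statistics(times, samples, mu0, nu0, response=1.0):
    """Return (excess_power, time_centroid, band_energy) for one block.

    mu0 and nu0 are the background mean and variance; response stands in
    for the frequency-averaged response R(f) (hypothetical argument).
    """
    n = len(samples)
    mu = sum(samples) / n                      # block mean
    weights = [(x - mu) ** 2 for x in samples]
    nu = sum(weights) / (n - 1)                # sample variance (assumed n - 1)
    # Eq. (C5): excess power relative to the background
    excess_power = n * (mu ** 2 + nu) / (mu0 ** 2 + nu0)
    # Eq. (C8): time centroid weighted by squared deviation from the mean
    centroid = sum(t * w for t, w in zip(times, weights)) / sum(weights)
    # Eq. (C9): band-limited energy scaled by the response
    energy = response * (mu ** 2 * n + nu * (n - 1))
    return excess_power, centroid, energy
```

With this definition of the variance, the bracket in Eq. (C9) equals the sum of the squared samples, so the energy is simply the response times the block's total power.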



The BlockNormal algorithm was applied separately to 
the data in each frequency band (Table III) to select candidate
GW burst events. The burst event generation was
done on relatively long-duration epochs (up to 1200 s) of 
continuous data to provide the best measure of the back- 
ground noise characteristics. 

Prior to the network coincidence step, events within 
each frequency band that are nearly adjacent were clus- 
tered into composite events. Then, events between ad- 
jacent frequency bands whose time centroids were close 
were clustered into composite multiband events. All 
events were then characterized by their frequency cov- 
erage. For composite events, the effective time centroid 
was the energy-weighted average of the time centroid of 
the constituent events. The band-limited energy for com- 
posite events was simply the sum of the per-event ener- 
gies. The central frequency for events in a single band 
was estimated by the average frequency of that band. For 
multi-band events, the energy- weighted average of these 
central frequencies was used. 
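The clustering rules above can be sketched in a few lines of Python. This is a minimal illustration, assuming each constituent event is a dictionary with hypothetical keys `time`, `energy` and `freq`; the criteria that decide which events are clustered together are not reproduced here.

```python
def merge_events(events):
    """Combine constituent events into one composite event.

    The composite energy is the sum of the per-event energies; the time
    centroid and central frequency are energy-weighted averages.
    """
    total = sum(e["energy"] for e in events)
    time = sum(e["time"] * e["energy"] for e in events) / total
    freq = sum(e["freq"] * e["energy"] for e in events) / total
    return {"time": time, "energy": total, "freq": freq}
```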



4. Network coincidence 

The signals from actual GW bursts in the LIGO inter- 
ferometers should be separated in time by no more than 
the maximum transit time (10 ms) for GW between the 
Hanford and Livingston sites. For the co-located interfer- 
ometers at Hanford, there should be no time separation. 
The separation observed in the reconstructed events is 
larger due to limited time resolution, phase-delays in filtering,
etc. For a candidate trigger, the time difference
between candidate events in each pair of interferometers,
$\tau^{(2)}_i - \tau^{(2)}_j$, was required to fall within a fixed coincidence
window, $\Delta T_{ij}$, for that pair of interferometers. This coincidence
window had to be much broader than the transit
time to account for limited time resolution and skewing of
the time distributions from differential antenna response
to $h_+$ and $h_\times$ waveforms.

The signals from actual GW bursts should also have
similar strain amplitude (and hence statistical significance)
in each interferometer. We derived a measure of
coincident significance from the excess power significance
$\Lambda_E$ in each candidate event in the trigger. This measure
must correct for the lower significance for GW signals in
the shorter H2 interferometer (as compared to the H1
interferometer) as well as the fluctuation of the relative
GW signal strengths at the two LIGO sites due to modulation
from the antenna factors. The chosen metric for
coincident significance, termed "combined power" or $P_C$,
was defined as

\[
P_C = \left(\Lambda_{H1}\,\Lambda_{H2}\,\Lambda_{L1}\right)^{1/3} \tag{C10}
\]

This formulation was found to have the best performance
in optimizing sensitivity to GW burst signals as a function
of the background trigger rate.

The coincidence procedure first identified events from
each of the three detectors that had overlapping frequency
coverage. These events then had to have time
centroids whose difference $\Delta T$ was less than 100 ms. Such
time-coincidence events were retained as GW burst triggers
if their combined power $P_C$ was above a threshold
of 22.
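A minimal Python sketch of this coincidence step is given below. It assumes that "overlapping frequency coverage" means a common overlap of all three bands and that the 100 ms requirement applies to the largest spread among the three time centroids; both readings, and the dictionary keys, are assumptions made for illustration.

```python
def combined_power(lam_h1, lam_h2, lam_l1):
    """Eq. (C10): geometric mean of the three excess-power significances."""
    return (lam_h1 * lam_h2 * lam_l1) ** (1.0 / 3.0)


def is_trigger(events, max_dt=0.100, threshold=22.0):
    """events: one dict per detector with keys 'time', 'f_low', 'f_high', 'lam'."""
    # Frequency coverage must overlap in all detectors (assumed: common band).
    if max(e["f_low"] for e in events) >= min(e["f_high"] for e in events):
        return False
    # Time centroids must agree to within max_dt seconds.
    times = [e["time"] for e in events]
    if max(times) - min(times) >= max_dt:
        return False
    # Combined power must exceed the threshold.
    return combined_power(*(e["lam"] for e in events)) > threshold
```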



TABLE IV: Cuts used by the BlockNormal-CorrPower
pipeline in the first year of S5. The parameters are: combined
power $P_C$, overall CorrPower $\Gamma$ value, CorrPower $\Gamma$ values for
various detector pairs, H1-H2 correlation $R_0$, and estimated
$h_{rss}$ values in H1 and H2.

H1H2L1 Network

$P_C > 22$
$\Gamma > 5.0$ for $f < 200$ Hz
$\Gamma > 3.8$ for $f > 200$ Hz
$\Gamma_{H1H2} > 0.5$, $\Gamma_{H1L1} > 0.3$, $\Gamma_{H2L1} > 0.3$
$R_0 > 0$
$|\log_{10}(h_{rss,H1}/h_{rss,H2})| < 0.4$



5. Network correlation 



The signals from GW bursts in each interferometer result
from the same parent waveforms, and thus should
have a large correlation sample-by-sample (after correction
for propagation delay). The cross-correlation statistic
$\Gamma$ reported by the CorrPower [36] package is the maximum
of the average correlation confidence of pair-wise
correlation tests. It is positive-definite. Larger values
denote greater statistical certainty of coherence. The
CorrPower package was run on the list of candidate trigger
times produced in the coincidence step. It retrieved
the full time-series data from each interferometer around
that time, calibrated the data, and calculated the $\Gamma$
cross-correlation statistic. For the three LIGO interferometers,
cuts were also made on the three pair-wise correlation
tests.

Additional selection criteria took advantage of the special
relationship for GW signals from the co-located interferometers
H1 and H2. One was the signed correlation
factor between the H1 and H2 interferometers from
the CorrPower processing, termed $R_0$. For triggers from
GW bursts, this correlation factor should be positive.
For triggers from a background of random coincidences,
there should be an equal number of positive and negative
correlation factors. Also, since the H1 and H2 interferometers
receive the same GW signal, the ratio of $h_{rss,H2}$
to $h_{rss,H1}$ should be close to one for a true GW burst. In
contrast, for triggers from a random background this ratio
will be centered around one-half. This arises because
the H2 interferometer is approximately half as sensitive
as H1, so signals of the same statistical significance (near
the threshold) will have only one-half the amplitude in
H2 as they do in H1. To simplify thresholding, the absolute
value of the logarithm of the ratio was calculated,
$R_{H1H2} = |\log_{10}(h_{rss,H1}/h_{rss,H2})|$, for later use.

The choices of tuning parameters are described in Table IV.
Figure 11 illustrates an example of plots used to
tune the figures of merit for the H1H2L1 network.

FIG. 11: Distribution of background and injection events with
respect to the CorrPower $\Gamma$. The narrow black histogram
represents the background (noise) triggers while the broader
histogram represents the distribution of the injections. These
triggers were generated in the H1H2L1 network and contain
frequencies below 200 Hz. The vertical line indicates the cut
made on this quantity.
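The two H1-H2 consistency criteria described in this subsection can be sketched as follows. The 0.4 limit on the logarithmic amplitude ratio is the value listed in Table IV; the requirement that the signed correlation be strictly positive is an assumption based on the text, and the function name is hypothetical.

```python
import math


def passes_h1h2_cuts(r0, hrss_h1, hrss_h2, max_log_ratio=0.4):
    """Return True if a trigger passes both H1-H2 consistency cuts."""
    # |log10| of the amplitude ratio, so the cut is symmetric in H1 and H2.
    log_ratio = abs(math.log10(hrss_h1 / hrss_h2))
    # Signed H1-H2 correlation assumed to be required positive.
    return r0 > 0 and log_ratio < max_log_ratio
```

Note that a factor-of-two amplitude ratio corresponds to a logarithm of about 0.30, so a cut at 0.4 only rejects triggers whose H1 and H2 amplitudes differ by more than a factor of roughly 2.5.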



APPENDIX D: THE QPIPELINE BURST 
SEARCH ALGORITHM 

1. Overview 

QPipeline is an analysis pipeline for the detection of
GW bursts in data from interferometric gravitational
wave detectors [26]. It is based on the Q transform [57], a
multi-resolution time-frequency transform that projects
the data under test onto the space of bisquare-windowed
complex exponentials characterized by central time $\tau$,
central frequency $f_0$, and quality factor $Q$:

\[
X(\tau, f_0, Q) = \int_{-\infty}^{+\infty} \tilde{x}(f)\, \tilde{w}(f, f_0, Q)\, e^{+i 2\pi f \tau}\, df, \tag{D1}
\]

where the bisquare window $\tilde{w}(f, f_0, Q)$ is

\[
\tilde{w}(f, f_0, Q) =
\begin{cases}
A \left[ 1 - \left( \dfrac{f Q}{f_0 \sqrt{5.5}} \right)^{2} \right]^{2} & \text{for } f < \dfrac{f_0 \sqrt{5.5}}{Q} \\
0 & \text{otherwise}
\end{cases} \tag{D2}
\]

with

\[
A = \left( \frac{315}{128 \sqrt{5.5}}\, \frac{Q}{f_0} \right)^{1/2}. \tag{D3}
\]

The bisquare window is a close approximation to a
Gaussian window in frequency space; the QPipeline is effectively
a templated matched filter search [37] for signals
that are Gaussian enveloped sinusoids in the whitened
signal space.
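As a check on Eqs. (D2) and (D3), the sketch below evaluates the bisquare window and integrates its square numerically. It assumes that the window argument f is the offset from the central frequency and that the support condition applies to |f|, i.e. a window symmetric about zero offset; under that assumption the normalization A gives an integrated squared window of 2. The function names are hypothetical.

```python
import math


def bisquare_window(f, f0, q):
    """Bisquare window of Eq. (D2), with f taken as an offset frequency."""
    half_width = f0 * math.sqrt(5.5) / q
    if abs(f) >= half_width:
        return 0.0
    # Normalization of Eq. (D3)
    amplitude = math.sqrt(315.0 * q / (128.0 * math.sqrt(5.5) * f0))
    return amplitude * (1.0 - (f / half_width) ** 2) ** 2


def window_energy(f0, q, steps=20000):
    """Midpoint-rule estimate of the integral of w^2 over the window support."""
    half_width = f0 * math.sqrt(5.5) / q
    df = 2.0 * half_width / steps
    return df * sum(
        bisquare_window(-half_width + (i + 0.5) * df, f0, q) ** 2
        for i in range(steps)
    )
```

The value 2 follows analytically: the integral of (1 - u^2)^4 over [-1, 1] is 256/315, so A^2 times the half-width times 256/315 equals 2 for any f0 and Q.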



2. Data conditioning 

Before applying the Q transform, the data are first 
whitened by zero-phase linear predictive filtering [26, 38].
In linear predictive whitening, the nth sample of a discrete
data sequence is assumed to be well modeled by a
linear combination of the previous M samples:

\[
\hat{x}[n] = \sum_{m=1}^{M} c[m]\, x[n-m] \tag{D4}
\]

The resulting whitened data stream is the prediction error
sequence $e[n] = x[n] - \hat{x}[n]$ that remains after selecting
the coefficients $c[m]$ to minimize the error in the
least-squares sense.

The prediction error length M is taken to be equal to 
the length of the longest basis function under test, which 
is approximately 1 second. This ensures that the data 
are uncorrelated on the time scales of the analysis. 

In order to avoid introducing phase errors between 
detectors, a modified zero-phase whitening filter is con- 
structed by zero-padding the initial filter, converting to 
the frequency domain, and discarding all phase informa- 
tion. 



3. Measurement basis

The space of Gaussian enveloped complex exponentials
is an over-complete basis of waveforms, whose duration
$\sigma_t$ and bandwidth $\sigma_f$ have the minimum possible
time-frequency uncertainty, $\sigma_t \sigma_f = 1/4\pi$, where
$Q = f_0/(\sqrt{2}\,\sigma_f)$. As a result, they provide the tightest
possible constraints on the time-frequency area of a signal,
maximizing the measured signal to noise ratio (SNR) and
minimizing the probability that false triggers are coincident
in time and frequency between multiple detectors.

In practice, the Q transform is evaluated only for a
finite number of basis functions, which are more commonly
referred to as templates or tiles. These templates
are selected to cover a targeted region of signal space,
and are spaced such that the fractional signal energy loss
$-\delta Z/Z$ due to the mismatch $\delta\tau$, $\delta f_0$, and $\delta Q$ between
an arbitrary basis function and the nearest measurement
template,

\[
-\frac{\delta Z}{Z} \simeq \frac{2\pi^2 f_0^2}{Q^2}\,\delta\tau^2
 + \frac{1 + Q^2}{2 f_0^2}\,\delta f_0^2
 + \frac{1}{2 Q^2}\,\delta Q^2
 - \frac{1}{f_0 Q}\,\delta f_0\,\delta Q \tag{D5}
\]

is no larger than $\sim$20%. This naturally leads to a tiling
of the signal space that is logarithmic in Q, logarithmic
in frequency, and linear in time.

For this search, the QPipeline was applied to search
the space of sinusoidal Gaussians with central frequency
from 48 Hz to 2048 Hz, and with $Q$ from $\sqrt{5.5}$ to $100/\sqrt{2}$.



4. Trigger generation

The statistical significance of Q transform projections
is given by their normalized energy $Z$, defined as the ratio
of squared projection magnitude to the mean squared
projection magnitude of other templates with the same
central frequency and Q. For the case of ideal white
noise, $Z$ is exponentially distributed and is related to the
matched filter SNR quantity $\rho$ [37] by the relation

\[
Z = |X|^2 / \langle |X|^2 \rangle_\tau = -\ln \Pr[Z' > Z] = \rho^2/2 . \tag{D6}
\]

The Q transform is applied to the whitened data and
normalized energies are computed for each measurement
template as a function of time. Templates with statistically
significant signal content are then identified by
applying a threshold on the normalized energy. Finally,
since a single event may potentially produce multiple
overlapping triggers due to the overlap between measurement
templates, only the most significant of overlapping
templates are reported as triggers.

Clustering of nearby triggers is not used in evaluating
the significance of events. As a result, the detectability of
GW burst signals depends on their maximum projection
onto the space of Gaussian enveloped sinusoids.



5. Coherence



For this search, the QPipeline took advantage of the 
co-located nature of the two LIGO Hanford detectors to 
form two linear combinations of the data streams from 
the two detectors. This coherent analysis makes use of 
correlations in the data to distinguish true GW signals 
from instrumental glitches. 



a. Coherent signal stream 

The first combination is the coherent signal stream,
H+, a frequency dependent weighted sum of the data
from the Hanford detectors which maximizes the effective
SNR. The weighting is inversely proportional to the noise
power spectral density, $S(f)$:

\[
\tilde{x}_{H+}(f) = \left( \frac{1}{S_{H1}} + \frac{1}{S_{H2}} \right)^{-1}
 \left( \frac{\tilde{x}_{H1}(f)}{S_{H1}(f)} + \frac{\tilde{x}_{H2}(f)}{S_{H2}(f)} \right) \tag{D7}
\]

The resulting combination is treated as the output of
a new hybrid, "coherent" detector. Under the assumption
that the power spectral density is approximately flat
across the window bandwidth, applying the Q transform
to this data stream leads to a coherent energy value,
$|X^{coh}_{H+}|^2$, which takes the following form:

\[
|X^{coh}_{H+}|^2 = \left( \frac{1}{S_{H1}} + \frac{1}{S_{H2}} \right)^{-2}
 \left( \frac{|X_{H1}|^2}{S_{H1}^2} + \frac{|X_{H2}|^2}{S_{H2}^2}
 + \frac{X_{H1}^{*} X_{H2} + X_{H1} X_{H2}^{*}}{S_{H1} S_{H2}} \right) \tag{D8}
\]

where $X_{H1}$, $X_{H2}$, and $X_{H+}$ are functions of $\tau$, $f_0$, and
$Q$, and the asterisk denotes complex conjugation. The
last term represents the contribution of the cross-term,
and is conceptually similar to a frequency domain representation
of a cross-correlation of the H1 and H2 data
streams.

The energy expected in the coherent data stream if
there were no correlations in the data can be characterized
by the "incoherent" terms in Eq. (D8)

\[
|X^{inc}_{H+}|^2 = \left( \frac{1}{S_{H1}} + \frac{1}{S_{H2}} \right)^{-2}
 \left( \frac{|X_{H1}|^2}{S_{H1}^2} + \frac{|X_{H2}|^2}{S_{H2}^2} \right) \tag{D9}
\]

The coherent and incoherent energies can then be normalized
in the manner of Eq. (D6):

\[
Z^{coh}_{H+} = |X^{coh}_{H+}|^2 / \langle |X^{coh}_{H+}|^2 \rangle_\tau \tag{D10}
\]

\[
Z^{inc}_{H+} = |X^{inc}_{H+}|^2 / \langle |X^{inc}_{H+}|^2 \rangle_\tau \tag{D11}
\]

The correlation between the detectors can then be measured
by the correlated energy, $Z^{corr}_{H+}$, given by

\[
Z^{corr}_{H+} = Z^{coh}_{H+} - Z^{inc}_{H+}
 \simeq \frac{X_{H1}^{*} X_{H2} + X_{H1} X_{H2}^{*}}{S_{H1} + S_{H2}} \tag{D12}
\]
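The relation between the coherent energy, the incoherent energy and the cross-term can be illustrated with a short Python sketch. It evaluates the un-normalized energies of Eqs. (D8) and (D9) for a single tile, given complex projections and noise power spectral densities; the function name is hypothetical and the normalization of Eqs. (D10) and (D11) is not applied.

```python
def hanford_energies(x_h1, x_h2, s_h1, s_h2):
    """Return (coherent, incoherent, cross) energies for one tile.

    x_h1, x_h2: complex Q transform projections for H1 and H2.
    s_h1, s_h2: noise power spectral densities (assumed flat over the tile).
    """
    norm = (1.0 / s_h1 + 1.0 / s_h2) ** -2
    # Eq. (D9): terms that survive when the two streams are uncorrelated
    incoherent = norm * (abs(x_h1) ** 2 / s_h1 ** 2 + abs(x_h2) ** 2 / s_h2 ** 2)
    # Cross-term of Eq. (D8): X1* X2 + X1 X2* = 2 Re(X1* X2)
    cross = norm * 2.0 * (x_h1.conjugate() * x_h2).real / (s_h1 * s_h2)
    return incoherent + cross, incoherent, cross
```

For identical projections and equal noise levels the coherent energy is twice the incoherent energy, while for projections of opposite sign it vanishes, which is the behaviour the correlated energy is designed to expose.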



b. Null stream 

The second combination is the difference between the
calibrated data from the two detectors, known as the null
stream, and is defined as

\[
\tilde{x}_{H-}(f) = \tilde{x}_{H1}(f) - \tilde{x}_{H2}(f). \tag{D13}
\]

By subtracting the co-located streams, any true gravitational
wave signal should be canceled. The resulting
combination is treated as the output of a new hybrid
"H-" detector, which shows significant energy content
in the presence of instrumental glitches but does not respond
to gravitational waves. Glitches are identified by
thresholding on the corresponding normalized "null energy",
$Z^{null}_{H-}$, calculated in an analogous manner to $Z^{coh}_{H+}$.

Signal tiles found to be in coincidence with significant
null stream tiles are vetoed as instrumental glitches, and
are not considered as candidate events. The threshold on
$Z^{null}_{H-}$ can be expressed as

\[
Z^{null}_{H-} > \alpha + \beta Z^{inc}_{H-} \tag{D14}
\]

where $\alpha$ is chosen to limit the veto rate in Gaussian noise
to $\sim$1 per 2048 tiles and $\beta$ is a parameter corresponding
to the allowed tolerance in calibration uncertainty. This
is an energy factor, and corresponds to an amplitude calibration
uncertainty of approximately 22 percent.

We expect that highly energetic instrumental glitches
could leak energy into adjacent time-frequency bins, so
the veto coincidence requirement between signal and null
streams is scaled to give more-significant null stream tiles
more area of veto influence in time-frequency space:

\[
|\tau_{H-} - \tau_{H+}| < (\delta\tau'_{H-} + \delta\tau_{H+})/2, \tag{D15}
\]

\[
|f_{0,H-} - f_{0,H+}| < (\delta f'_{0,H-} + \delta f_{0,H+})/2, \tag{D16}
\]

where $\tau$ and $f_0$ are the central time and frequency of a
tile, $\delta\tau$ and $\delta f_0$ are the duration and bandwidth of a tile,
and the inflated null stream tile duration and bandwidth
are defined as:

\[
\delta\tau'_{H-} = \max\left(1,\ 0.5\sqrt{2 Z^{null}_{H-}}\right) \times \delta\tau_{H-} \tag{D17}
\]

\[
\delta f'_{0,H-} = \max\left(1,\ 0.5\sqrt{2 Z^{null}_{H-}}\right) \times \delta f_{0,H-} \tag{D18}
\]
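The veto coincidence test of Eqs. (D15) to (D18) can be sketched as follows. The sketch assumes the null stream tile has already passed the threshold of Eq. (D14), and it represents tiles as dictionaries with hypothetical keys `time`, `freq`, `duration`, `bandwidth` and, for the null tile, `z_null`.

```python
import math


def inflated(extent, z_null):
    """Eqs. (D17)-(D18): widen a null-stream tile extent by its significance."""
    return max(1.0, 0.5 * math.sqrt(2.0 * z_null)) * extent


def is_vetoed(signal_tile, null_tile):
    """Return True if the signal tile overlaps the inflated null-stream tile."""
    dt = inflated(null_tile["duration"], null_tile["z_null"])
    df = inflated(null_tile["bandwidth"], null_tile["z_null"])
    # Eq. (D15): overlap in time
    time_ok = abs(null_tile["time"] - signal_tile["time"]) < (dt + signal_tile["duration"]) / 2.0
    # Eq. (D16): overlap in frequency
    freq_ok = abs(null_tile["freq"] - signal_tile["freq"]) < (df + signal_tile["bandwidth"]) / 2.0
    return time_ok and freq_ok
```

The inflation factor stays at 1 until the normalized null energy exceeds 2, so only clearly significant null stream tiles gain a wider area of veto influence.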



6. Coincidence 

Coherent triggers from the two LIGO Hanford detectors
were also tested for time-frequency coincidence with
triggers from the LIGO Livingston detector using the following
criteria, where $T$ is the speed of light travel time
of 10 ms between the two LIGO sites:

\[
|\tau_{H} - \tau_{L}| < \max(\delta\tau_{H}, \delta\tau_{L})/2 + T, \tag{D19}
\]

\[
|f_{0,H} - f_{0,L}| < \max(\delta f_{0,H}, \delta f_{0,L})/2 . \tag{D20}
\]



Coincidence between the LIGO Hanford and Liv- 
ingston sites is not a requirement for detection, even if 
detectors at both sites are operational. The final trig- 
ger set is the union of triggers from the coherent H1H2 
trigger set and the coincident H1H2L1 trigger set. The 
additional requirement of coincidence permits a lower 
threshold, and therefore greater detection efficiency, for 
the H1H2L1 data set. 
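The Hanford-Livingston coincidence criteria of Eqs. (D19) and (D20) amount to a simple overlap test, sketched below. The dictionary keys and the function name are hypothetical; the 10 ms travel time is the value quoted in the text.

```python
LIGHT_TRAVEL_TIME = 0.010  # seconds between the two LIGO sites


def hl_coincident(h_tile, l_tile, travel_time=LIGHT_TRAVEL_TIME):
    """Return True if a Hanford and a Livingston tile are coincident."""
    # Eq. (D19): time overlap, widened by the inter-site travel time
    dt_limit = max(h_tile["duration"], l_tile["duration"]) / 2.0 + travel_time
    # Eq. (D20): frequency overlap
    df_limit = max(h_tile["bandwidth"], l_tile["bandwidth"]) / 2.0
    return (abs(h_tile["time"] - l_tile["time"]) < dt_limit
            and abs(h_tile["freq"] - l_tile["freq"]) < df_limit)
```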

The choices of tuning parameters are described in Table V.
Figure 12 shows an example scatter plot used to tune the
figures of merit for the H1H2L1 network.

TABLE V: Cuts used by the QPipeline analysis in the first
year of S5. The parameters are: H1/H2 coherent significance
$Z_{H+}$; H1/H2 correlated significance $Z^{corr}_{H+}$, and L1 normalized
energy $Z_{L1}$.

H1H2L1 Network 



^H+ r > max I 15, 50 



^H+ r > niax 5, 30 



Z L i > 12.5 



J 12.5 
I2J5 



for / < 200 Hz 
for / > 200 Hz 



H1H2 Network 


Z, H+ 
Z 'H+ 


> 20 

> 50 

> 30 


for / < 200 Hz 
for / > 200 Hz 




FIG. 12: Scatter plot of the H1H2 correlated energy
[defined in Eq. (D12)], which measures the correlation of the
strain at the two Hanford interferometers, versus the L1 normalized
energy [defined in Eq. (D6)]. The distribution of the
background triggers is displayed in black while the distribution
of simulated GW signals is in gray. This example tuning
plot is for triggers generated for the H1H2L1 network and
containing frequencies below 200 Hz. The cuts on these quantities
are displayed on the plot as thick lines.



APPENDIX E: THE COHERENT WAVEBURST 
SEARCH ALGORITHM 

1. Overview 

Coherent WaveBurst (cWB) is an analysis pipeline for 
the detection and reconstruction of GW burst signals 
from a network of detectors. The reconstructed gravi- 
tational waveform h that best describes the response of 
the network is used to compute the maximum likelihood 
ratio of the putative GW signal, which forms the main 
detection statistic for the search. In effect, cWB is equivalent
to a matched filter search with a very large template
bank representing all possible time-domain signals with 
short duration. 

The cWB pipeline is divided into three main stages: 
the generation of coherent triggers, the reconstruction of 
the GW signal and the computation of the maximum 
likelihood ratio, and a post-production stage where addi- 
tional detection cuts are applied. By using weighted co- 
herent combinations of the data streams, cWB is not lim- 
ited by the least sensitive detector in the network. The 
waveform reconstruction allows various physical proper- 
ties of the signal to be estimated, including the sky loca- 
tion of the source. The coherent approach also allows for 
other statistics to be constructed, such as the null stream 
and coherent energy, to distinguish genuine GW signals 
from environmental and instrumental artifacts. 



2. Data conditioning and time-frequency 
decomposition 

The cWB analysis is performed in the wavelet domain.
A discrete Meyer wavelet transformation is applied to the
sampled detector output to produce a discrete wavelet
series $a_k[i,j]$, where $i$ is the time index, $j$ is the scale index
and $k$ is the detector index. An important property
of Meyer wavelets is that they form an orthonormal basis
that allows for the construction of wavelet filters with
small spectral leakage [28]. Wavelet series give a time-scale
representation of data where each wavelet scale can
be associated with a certain frequency band of the initial
time series. Therefore a wavelet time-scale spectrum can
be displayed as a time-frequency (TF) scalogram, where
the scale is replaced with the central frequency $f$ of the
band. The time series sampling rate $R$ and the scale number
$j$ determine the time resolution $\Delta t_j(R)$ at this scale.
The frequency resolution $\Delta f_j$ is defined as $1/(2\Delta t_j)$ and
determines the data bandwidth at the scale $j$. The time-frequency
resolution defines the tiling of the TF plane.
The individual tiles (pixels) represent data samples in the
wavelet domain. In the cWB pipeline a uniform tiling is
used ($\Delta f_j(R) = R/2^n$, where $n$ is the wavelet decomposition
depth), which is obtained with the Meyer packet
transformation [39]. In this case the TF resolution is the
same for all wavelet scales. For optimal localization of
the GW energy in the TF plane, the cWB analysis is
performed at six different frequency resolutions: 8, 16,
32, 64, 128 and 256 Hz.

Before the coherent analysis is performed, two data
conditioning algorithms are applied to the data in the
wavelet domain: a linear prediction error (LPE) filter and
a wavelet estimator of the power spectral density $S_k[j]$.
LPE filters are used to remove "predictable" components
from an input data series. In the cWB pipeline they
are constructed individually for each wavelet layer and
remove such components in the data as power line harmonics
and violin-mode lines. A more detailed description
of the LPE filters can be found elsewhere [28, 30].

The wavelet estimator of the one-sided power spectral
density associated with each wavelet layer $j$ is

\[
S_k[j] = \frac{2\,\sigma_k^2[j]}{R} \tag{E1}
\]

where $\sigma_k^2[j]$ is the variance of the detector noise. In the
analysis we assume that the detector noise is Gaussian
and quasi-stationary. The variance estimator may vary
with time and therefore it is calculated for each sample
in the wavelet layer: $\sigma_k^2[i,j]$. The estimation of the noise
variance is performed on data segments of length 60 seconds,
with 40 seconds overlap. Linear interpolation is
used between two measurements to obtain $\sigma_k^2[i,j]$.
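The variance tracking described here can be sketched in plain Python. The sketch estimates the variance of one wavelet layer on 60-second segments with 40 seconds of overlap, interpolates linearly between segment centres, and converts a variance to the power spectral density of Eq. (E1). The function names, the use of the segment centre as the reference time, and the clamping at the ends are assumptions made for illustration.

```python
def segment_variances(samples, rate, length=60.0, overlap=40.0):
    """Return [(centre_time, variance)] for overlapping segments of one layer."""
    n_seg = int(length * rate)
    stride = int((length - overlap) * rate)
    out = []
    for start in range(0, len(samples) - n_seg + 1, stride):
        seg = samples[start:start + n_seg]
        mean = sum(seg) / n_seg
        var = sum((x - mean) ** 2 for x in seg) / n_seg
        out.append(((start + n_seg / 2.0) / rate, var))
    return out


def interpolate(points, t):
    """Linear interpolation between neighbouring estimates, clamped at the ends."""
    if t <= points[0][0]:
        return points[0][1]
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        if t <= t1:
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    return points[-1][1]


def psd_from_variance(variance, sample_rate):
    """Eq. (E1): one-sided PSD for noise variance at sampling rate R."""
    return 2.0 * variance / sample_rate
```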



3. Coherent triggers 

The first step in the analysis is to identify segments of
data that may contain a signal. The triggers are evaluated
using the whitened data vector $\mathbf{w}[i,j]$

\[
\mathbf{w}[i,j](\theta,\phi) = \left( \frac{a_1[i,j,\tau_1(\theta,\phi)]}{\sigma_1[i,j]}, \ldots,
 \frac{a_K[i,j,\tau_K(\theta,\phi)]}{\sigma_K[i,j]} \right) \tag{E2}
\]

The sampled detector amplitudes in the wavelet domain
$a_k[i,j,\tau_k]$ take into account the time delays $\tau_k$ due to
the time-of-flight between the detectors, which in turn
depend on the source coordinates $\theta$ and $\phi$. Coherent
triggers are generated for the entire network by maximizing
the norm $|\mathbf{w}[i,j]|$ over the entire sky for each
time-frequency location $[i,j]$. To do this, the sky is divided
into square degree patches and the quantity $|\mathbf{w}|$ is
calculated for each patch from the delayed detector amplitudes
$a_k[i,j,\tau_k]$. By selecting clusters of pixels with
the $\max_{\theta,\phi}|\mathbf{w}|$ above some threshold, one can identify
coherent triggers in the time-frequency plane. The data
pixels $w_k[i,j]$ selected by this procedure are then used to
reconstruct the GW signal and compute the maximum
likelihood statistic.
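The maximization over the sky for a single time-frequency pixel can be sketched as below. The sketch assumes the delayed detector amplitudes have already been looked up for each sky patch; how the delays are computed from the source coordinates is not shown, and the function names and patch labels are hypothetical.

```python
import math


def whitened_norm(amplitudes, sigmas):
    """|w| for one pixel: norm of the noise-scaled detector amplitudes (Eq. E2)."""
    return math.sqrt(sum((a / s) ** 2 for a, s in zip(amplitudes, sigmas)))


def best_sky_patch(delayed_amplitudes, sigmas):
    """Return (patch, |w|) for the patch that maximizes the whitened norm.

    delayed_amplitudes maps a sky-patch label to the per-detector amplitudes
    already shifted by that patch's time delays.
    """
    patch = max(
        delayed_amplitudes,
        key=lambda p: whitened_norm(delayed_amplitudes[p], sigmas),
    )
    return patch, whitened_norm(delayed_amplitudes[patch], sigmas)
```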



4. Maximum likelihood ratio functional 

For the case of Gaussian quasi-stationary noise, the
likelihood that data $a$ is purely instrumental noise is proportional
to $\exp\{-(a|a)/2\}$, while the likelihood that a
GW signal $h$ is present is proportional to $\exp\{-(a-h|a-h)/2\}$.
The ratio of these likelihoods can be used as a
detection statistic. Here $(x|y)$ defines a noise weighted
inner product, which for $K$ detectors with uncorrelated
noise can be written in the wavelet domain as

\[
(x|y) = \sum_{k=1}^{K} \sum_{i,j \in \Omega_{TF}} \frac{x_k[i,j]\, y_k[i,j]}{\sigma_k^2[i,j]} \tag{E3}
\]

where time $i$ and frequency $j$ indices run over some time-frequency
area $\Omega_{TF}$ selected for the analysis. The coherent
WaveBurst pipeline defines $\mathcal{L}$ as twice the (log)
likelihood ratio, and treats it as a functional in $h_{det}(h)$:

\[
\mathcal{L}[h] = 2(a|h_{det}) - (h_{det}|h_{det}), \tag{E4}
\]


where $h^{det}_k[i,j]$ are the detector responses (Eq. (7.2)). The
network sensitivity is characterized by the noise-scaled
antenna pattern vectors $\mathbf{f}_+$ and $\mathbf{f}_\times$:

\[
\mathbf{f}_{+(\times)}[i,j] = \left( \frac{F_{1,+(\times)}(\theta,\phi)}{\sigma_1[i,j]}, \ldots,
 \frac{F_{K,+(\times)}(\theta,\phi)}{\sigma_K[i,j]} \right) \tag{E5}
\]

Since the detector responses $h^{det}_k$ are independent of rotation
by an arbitrary polarization angle in the wave frame,
it is convenient to perform calculations in the dominant
polarization frame (DPF) [40]. In this frame the antenna
pattern vectors $\mathbf{f}_+$ and $\mathbf{f}_\times$ are orthogonal to each other:

\[
\left( \mathbf{f}_+(\Psi_{DPF}) \cdot \mathbf{f}_\times(\Psi_{DPF}) \right) = 0 \tag{E6}
\]



and we refer to them as $\mathbf{f}_1$ and $\mathbf{f}_2$ respectively. The corresponding
solutions for the GW waveforms, $h_1$ and $h_2$, are
found by variation of the likelihood functional (Eq. (E4))
that can be written as the sum of two terms, $\mathcal{L} = \mathcal{L}_1 + \mathcal{L}_2$,
where

\[
\mathcal{L}_1 = \sum_{\Omega_{TF}} \left[ 2(\mathbf{w}\cdot\mathbf{f}_1)\, h_1 - |\mathbf{f}_1|^2 h_1^2 \right], \tag{E7}
\]

\[
\mathcal{L}_2 = \sum_{\Omega_{TF}} \left[ 2(\mathbf{w}\cdot\mathbf{f}_2)\, h_2 - |\mathbf{f}_2|^2 h_2^2 \right]. \tag{E8}
\]

The estimators of the GW waveforms for a particular
sky location are then the solutions of the equations
$\delta\mathcal{L}_1/\delta h_1 = 0$ and $\delta\mathcal{L}_2/\delta h_2 = 0$:

\[
h_1 = (\mathbf{w}\cdot\mathbf{f}_1)/|\mathbf{f}_1|^2, \tag{E9}
\]

\[
h_2 = (\mathbf{w}\cdot\mathbf{f}_2)/|\mathbf{f}_2|^2. \tag{E10}
\]

Note, the norms $|\mathbf{f}_1|$ and $|\mathbf{f}_2|$ characterize the network
sensitivity to the $h_1$ and $h_2$ polarizations respectively.
The maximum likelihood ratio statistic for sky location
$(\theta, \phi)$ is calculated by substituting the solution for $h$ into
$\mathcal{L}[h]$. The result can be written as

\[
L_{max} = \sum_{n,m=1}^{K} L_{nm} = \sum_{n,m=1}^{K} \sum_{\Omega_{TF}} w_n w_m P_{nm}, \tag{E11}
\]

where the matrix $P$ is the projection constructed from
the components of the unit vectors $\mathbf{e}_1$ and $\mathbf{e}_2$ along the
directions of the $\mathbf{f}_1$ and $\mathbf{f}_2$ respectively:

\[
P_{nm} = e_{1n} e_{1m} + e_{2n} e_{2m}. \tag{E12}
\]

The kernel of the projection $P$ is the signal plane defined
by these two vectors. The null space of the projection $P$
defines the reconstructed detector noise which is referred
to as the null stream.
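For a single time-frequency pixel, the waveform estimators of Eqs. (E9) and (E10) and the equivalence between the two forms of the maximum likelihood statistic can be checked with a short Python sketch. It assumes the antenna pattern vectors are already expressed in the dominant polarization frame, so that they are orthogonal, and it uses plain lists for vectors; the function names are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def max_likelihood(w, f1, f2):
    """Return (h1, h2, l_max) for one pixel, with f1 and f2 orthogonal."""
    h1 = dot(w, f1) / dot(f1, f1)          # Eq. (E9)
    h2 = dot(w, f2) / dot(f2, f2)          # Eq. (E10)
    # Substituting h1 and h2 back into the two likelihood terms
    l_max = dot(w, f1) ** 2 / dot(f1, f1) + dot(w, f2) ** 2 / dot(f2, f2)
    return h1, h2, l_max


def projection_matrix(f1, f2):
    """Eq. (E12): projector built from unit vectors along f1 and f2."""
    e1 = [x / math.sqrt(dot(f1, f1)) for x in f1]
    e2 = [x / math.sqrt(dot(f2, f2)) for x in f2]
    k = len(f1)
    return [[e1[n] * e1[m] + e2[n] * e2[m] for m in range(k)] for n in range(k)]


def quadratic_form(w, p):
    """Sum over n, m of w_n w_m P_nm for one pixel (Eq. E11)."""
    k = len(w)
    return sum(w[n] * w[m] * p[n][m] for n in range(k) for m in range(k))
```

In a three-detector example with the signal plane spanned by the first two coordinates, the component of the data along the third coordinate does not contribute to the likelihood; it is the part assigned to the null stream.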

The projection matrix is invariant with respect to the
rotation in the signal plane where any two orthogonal
unit vectors can be used for construction of the $P_{nm}$.
Therefore one can select vectors $\mathbf{u}$ and $\mathbf{v}$ such that
$(\mathbf{w}\cdot\mathbf{v}) = 0$ and

\[
P_{nm} = u_n u_m . \tag{E13}
\]

The unit vector $\mathbf{u}$ defines the vector

\[
\boldsymbol{\xi} = (\mathbf{w}\cdot\mathbf{u})\,\mathbf{u} \tag{E14}
\]

whose components are estimators of the noise-scaled detector
responses $h^{det}_k[i,j]/\sigma_k[i,j]$.

5. Regulators

In principle the likelihood approach outlined above can
be used for the reconstruction of the GW waveforms and
calculation of the maximum likelihood statistic. In practice
the formal solutions (E9), (E10) need to be regularized
by constraints that account for the way the network
responds to a generic GW signal [40]. For example, the
network may be insensitive to GW signals with a particular
sky location or polarization, resulting in an ill-posed
inversion problem. These problems are addressed by using
regulators and sky-dependent penalty factors.

A classical example of a singular inversion problem is a
network of aligned detectors where the detector responses
$h^{det}_k$ are identical. In this case the algorithm can be constrained
to search for one unknown function rather than
for the two GW polarizations $h_1$ and $h_2$, which span a
larger parameter space. Note that in this case $|\mathbf{f}_2| = 0$,
Eq. (E10) is ill-conditioned and the solution for the $h_2$
waveform cannot be found. Regulators are important
not only for aligned detectors, but also for networks of
misaligned detectors, for example, the LIGO and Virgo
network. Depending on the source location, the network
can be much less sensitive to the second GW component
($|\mathbf{f}_2|^2 \ll |\mathbf{f}_1|^2$) and the $h_2$ waveform may not be
reconstructable from the noisy data.

In the coherent WaveBurst analysis we introduce a regulator
by modifying the norm of the $\mathbf{f}_2$ vector:

\[
|\mathbf{f}'_2|^2 = |\mathbf{f}_2|^2 + \delta, \tag{E15}
\]

where $\delta$ is a tunable parameter. For example, if $\delta = \infty$,
the second GW component is entirely suppressed and the
regulator corresponds to the "hard constraint" described
in Ref. [40]. In this case the unit vector $\mathbf{u}$ (see Eq. (E13))
is pointing along the $\mathbf{f}_+$ direction. In the cWB analysis
the parameter $\delta$ is chosen to be



0.01



(E16)



This regulator is more stringent for weak events which
are generated by the pipeline at much higher rate than
the loud events.

The introduction of the regulator creates an obvious
problem for the construction of the projection matrix.
Namely, the vector $\mathbf{e}'_2 = \mathbf{f}_2/|\mathbf{f}'_2|$ and the corresponding
vector $\mathbf{u}'$ obtained by rotation of $\mathbf{e}_1$ and $\mathbf{e}'_2$ in the signal
plane are not unit vectors if $\delta \neq 0$. To fix this problem we
re-normalize the vector $\mathbf{u}'$ to unity and use it for calculation
of the maximum likelihood ratio and other coherent
statistics.



6. Coherent statistics 

When the detector noise is Gaussian and stationary,
the maximum likelihood $L_{max}$ is the only statistic required
for detection and selection of the GW events. In
this case the false alarm and the false dismissal probabilities
are controlled by the threshold on $L_{max}$ which is
an estimator of the total SNR detected by the network.
However, the real data are contaminated with instrumental
and environmental artifacts and additional selection
cuts should be applied to separate them from genuine
GW signals [28]. In the coherent WaveBurst method
these selection cuts are based on coherent statistics constructed
from the elements of the likelihood and the null
matrices. The diagonal terms of the matrix $L_{nm}$ describe
the reconstructed normalized incoherent energy.
The sum of the off-diagonal terms is the coherent energy
$E_{coh}$ detected by the network.

The next step is to optimize the solution over the sky.
Often, depending on the network configuration, the reconstruction
of source coordinates is ambiguous. For example,
for two separated detectors the relative time delay
that yields maximum correlation between the data
streams corresponds to an annulus on the sky. In this
case, an "optimal" source location is selected, where the
reconstructed detector responses are the most consistent
with the output detector data streams. To properly account
for the directional sensitivity of the network the
optimization over sky locations has to be more than a
simple maximization of $L_{max}(\theta, \phi)$. In the cWB analysis
the statistic that is maximized has the form

\[
L_{sky}(\theta, \phi) = L_{max}\, P_f\, cc, \tag{E17}
\]

where $P_f$ is the penalty factor and $cc$ is the network correlation
coefficient. $P_f$ and $cc$ are defined below in terms
of the matrix $L_{nm} = \sum_{\Omega_{TF}} w_n w_m P_{nm}$ and the diagonal
matrices $E_{nm} = E_n \delta_{nm}$ and $H_{nm} = H_n \delta_{nm}$ which
describe the normalized energy in the detectors, and the
normalized reconstructed signal energy (see Eq. (E14)),
with

\[
E_k = \sum_{\Omega_{TF}} w_k^2, \qquad H_k = \sum_{\Omega_{TF}} \xi_k^2. \tag{E18}
\]



Ideally, the reconstructed signal energy in each detector
$H_k$ should not significantly exceed the energy $E_k$.
This requirement can be enforced by the constraint

\[
\Lambda_k = \sum_{\Omega_{TF}} \left( w_k \xi_k - \xi_k^2 \right) = 0 \tag{E19}
\]

for each detector in the network. These constraints can
be applied during the signal reconstruction by way of
Lagrange multipliers in the variational analysis, however
this greatly increases the computational complexity of
the algorithm. A simpler alternative is to introduce a
penalty factor $P_f$ that penalizes sky locations violating
the constraint equation (E19):



[Eq. (E20), the definition of $P_f$, did not survive text extraction.]



In addition to serving as a penalty factor in the position
reconstruction, the ratio of reconstructed to detector
energy was also used as a post-production cut. Events
with $P_f < 0.6$ were discarded, as were events with large
values of the network energy disbalance



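$$\Lambda_{\rm NET} = \frac{\sum_k |\Lambda_k|}{E_{\rm coh}} \qquad \text{(E21)}$$

and the H1-H2 energy disbalance

$$\Lambda_{\rm HH} = \frac{|\Lambda_{H1} - \Lambda_{H2}|}{E_{\rm coh}}. \qquad \text{(E22)}$$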



The latter cut was found to be particularly effective at 
rejecting correlated glitches in the two Hanford interfer- 
ometers. 
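
A minimal sketch of the two disbalance cuts is given below. It assumes that the network disbalance is $\sum_k |\Lambda_k| / E_{\rm coh}$ and the H1-H2 disbalance is $|\Lambda_{H1} - \Lambda_{H2}| / E_{\rm coh}$, that $E_{\rm coh} > 0$, and that the per-detector values $\Lambda_k$ of Eq. (E19) are already available. The function names and example values are hypothetical; the default thresholds are those listed for the H1H2L1 network in Table VI.

```python
def energy_disbalances(lambdas, e_coh):
    """Return (network disbalance, H1-H2 disbalance or None).

    lambdas: dict mapping detector name to its Lambda_k value.
    e_coh:   coherent energy, assumed positive here.
    """
    lam_net = sum(abs(v) for v in lambdas.values()) / e_coh
    lam_hh = None
    if "H1" in lambdas and "H2" in lambdas:
        lam_hh = abs(lambdas["H1"] - lambdas["H2"]) / e_coh
    return lam_net, lam_hh


def passes_disbalance_cuts(lambdas, e_coh, net_max=0.35, hh_max=0.3):
    """Apply the cuts; the H1-H2 cut is skipped if H1 or H2 is absent."""
    lam_net, lam_hh = energy_disbalances(lambdas, e_coh)
    if lam_net >= net_max:
        return False
    return lam_hh is None or lam_hh < hh_max


# Hypothetical events, each with E_coh = 10.
signal_like = {"H1": 0.5, "H2": 0.25, "L1": -0.25}
hanford_glitch = {"H1": 1.6, "H2": -1.6, "L1": 0.0}
print(passes_disbalance_cuts(signal_like, 10.0))
print(passes_disbalance_cuts(hanford_glitch, 10.0))
```

The second event passes the network cut but fails the H1-H2 cut, which is the situation the latter cut is designed to catch.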

The network correlation coefficient is also used to 
weight the overall likelihood for each sky location. It 
is defined as 



$$cc = \frac{E_{\rm coh}}{N_{\rm null} + |E_{\rm coh}|}, \qquad \text{(E23)}$$



where $N_{\rm null}$ is the sum of all elements in the null matrix

$$N_{nm} = E_{nm} - L_{nm}, \qquad \text{(E24)}$$

which represents the normalized energy of the recon- 
structed noise. Usually for glitches little coherent energy 
is detected and the reconstructed detector responses are 
inconsistent with the detector output, which results in a 
large value for the null energy. In addition to helping se- 
lect the optimal sky location, the correlation coefficients 
cc are used for a signal consistency test based on the 
comparison of the null energy and the coherent energy. 
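
The comparison of null and coherent energy can be sketched numerically. The sketch assumes that the network correlation coefficient has the form $cc = E_{\rm coh} / (N_{\rm null} + |E_{\rm coh}|)$, with $N_{\rm null}$ the sum of all elements of $N_{nm} = E_{nm} - L_{nm}$ and $E_{nm}$ diagonal. The function name and all matrix values are hypothetical.

```python
def network_correlation(L, E_diag):
    """Network correlation coefficient from a likelihood matrix.

    L:      K x K likelihood matrix as a list of rows.
    E_diag: the K diagonal entries E_n of the energy matrix.
    Assumption: cc = E_coh / (N_null + |E_coh|), N_nm = E_nm - L_nm.
    """
    K = len(L)
    # Coherent energy: sum of the off-diagonal likelihood terms.
    e_coh = sum(L[n][m] for n in range(K) for m in range(K) if n != m)
    # Null energy: sum of all elements of E - L (E is diagonal).
    n_null = sum(E_diag) - sum(sum(row) for row in L)
    return e_coh / (n_null + abs(e_coh))


# Hypothetical signal-like event: large coherent energy, small null energy.
L_signal = [[4.0, 1.5, 1.0], [1.5, 3.0, 0.5], [1.0, 0.5, 2.0]]
cc_signal = network_correlation(L_signal, [6.0, 5.0, 6.0])

# Hypothetical glitch-like event: little coherent energy, large null energy.
L_glitch = [[4.0, 0.2, 0.1], [0.2, 3.0, 0.1], [0.1, 0.1, 2.0]]
cc_glitch = network_correlation(L_glitch, [8.0, 7.0, 6.0])
print(cc_signal, cc_glitch)
```

With these made-up numbers the signal-like event has cc well above the thresholds used in the search, while the glitch-like event falls far below them.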

The coherent terms of the likelihood matrix can also be
used to calculate the correlation coefficients

(E25) 



which represent Pearson's correlation coefficients in the
case of aligned detectors. We use the coefficients $r_{nm}$ to
construct the reduced coherent energy



(E26) 



Combined with the network correlation coefficient cc and 
the number of detectors in the network, K, it yields a 
quantity which we call the coherent network amplitude, 



$$\eta = \sqrt{\frac{e_{\rm corr}\, cc}{K}}. \qquad \text{(E27)}$$



Figure 13 shows the $\eta$-cc distribution of the background
events (see Sec. VI) and simulated GW events (see
Sec. VII) for the L1H1H2 network. Loud background
events due to detector glitches with low values of the
network correlation coefficient are rejected by a threshold on
cc. Relatively weak background events are rejected by a
threshold on $\eta$. Table VI describes the full set of tuning
parameters for cWB.



TABLE VI: Cuts used by the coherent WaveBurst pipeline in
the first year of S5. The parameters are: network correlation
coefficient cc, likelihood penalty factor $P_f$, coherent network
amplitude $\eta$, H1-H2 energy disbalance $\Lambda_{\rm HH}$, and network
energy disbalance $\Lambda_{\rm NET}$. Time-dependent cuts are noted with
UTC times.

H1H2L1 Network
    cc > 0.6, $P_f$ > 0.6
    $\eta$ > 5.7 for f < 200 Hz, up to Dec 12 2005 03:19:29 or after Oct 25 2006 09:34:17
    $\eta$ > 5.2 for f < 200 Hz, between Dec 12 2005 03:19:29 and Oct 25 2006 09:34:17
    $\eta$ > 4.25 for f > 200 Hz
    $\Lambda_{\rm HH}$ < 0.3, $\Lambda_{\rm NET}$ < 0.35

H1H2 Network
    cc > 0.65, $P_f$ > 0.6
    $\eta$ > 5.7 for f < 200 Hz
    $\eta$ > 4.6 for f > 200 Hz, up to Jul 17 2006 11:50:37
    $\eta$ > 4.25 for f > 200 Hz, after Jul 17 2006 11:50:37
    $\Lambda_{\rm HH}$ < 0.3, $\Lambda_{\rm NET}$ < 0.35

H1L1 Network
    cc > 0.6, $P_f$ > 0.6
    $\eta$ > 6.5 for f < 200 Hz, up to Oct 07 2006 08:58:06
    $\eta$ > 9.0 for f < 200 Hz, after Oct 07 2006 08:58:06
    $\eta$ > 5.0 for f > 200 Hz
    $\Lambda_{\rm NET}$ < 0.35

H2L1 Network
    cc > 0.6, $P_f$ > 0.6
    $\eta$ > 6.5 for f < 200 Hz, up to Mar 28 2006 04:23:06 or after Oct 28 2006 11:54:46
    $\eta$ > 5.0 for f < 200 Hz, between Mar 28 2006 04:23:06 and Oct 28 2006 11:54:46
    $\eta$ > 5.0 for f > 200 Hz
    $\Lambda_{\rm NET}$ < 0.35
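
As an illustration of how the time-dependent and frequency-dependent cuts in Table VI combine, the H1H2L1 row can be written as a small selection function. This is a sketch, not the pipeline's code: the function and argument names are hypothetical, and the treatment of events exactly at 200 Hz or exactly at a boundary time is an assumption, since the table does not specify it.

```python
from datetime import datetime, timezone

# Boundary times for the H1H2L1 low-frequency threshold (UTC, Table VI).
T_START = datetime(2005, 12, 12, 3, 19, 29, tzinfo=timezone.utc)
T_END = datetime(2006, 10, 25, 9, 34, 17, tzinfo=timezone.utc)


def eta_threshold_h1h2l1(freq_hz, t_utc):
    """Threshold on the coherent network amplitude for H1H2L1."""
    if freq_hz >= 200.0:  # assumption: 200 Hz itself is treated as "high"
        return 4.25
    if T_START < t_utc < T_END:  # assumption: boundaries are exclusive
        return 5.2
    return 5.7


def passes_h1h2l1_cuts(cc, p_f, eta, lam_hh, lam_net, freq_hz, t_utc):
    """Return True if an event passes all H1H2L1 cuts of Table VI."""
    return (
        cc > 0.6
        and p_f > 0.6
        and eta > eta_threshold_h1h2l1(freq_hz, t_utc)
        and lam_hh < 0.3
        and lam_net < 0.35
    )


# Hypothetical low-frequency event evaluated at two different epochs.
event = dict(cc=0.7, p_f=0.8, eta=5.5, lam_hh=0.1, lam_net=0.1, freq_hz=150.0)
mid_run = datetime(2006, 3, 1, tzinfo=timezone.utc)
early_run = datetime(2005, 12, 1, tzinfo=timezone.utc)
print(passes_h1h2l1_cuts(t_utc=mid_run, **event))
print(passes_h1h2l1_cuts(t_utc=early_run, **event))
```

The same event is kept during the epoch with the lower threshold and rejected outside it, which is why the UTC boundaries have to be carried along with the numerical cuts.
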
FIG. 13: Coherent network amplitude $\eta$ [defined in Eq. (E27)]
versus network correlation coefficient cc [defined in Eq. (E23)] for
cWB triggers below 200 Hz in the H1H2L1 network. The
black dots represent the noise triggers while the gray shadows
represent the distribution of a set of simulated GWs injected
into the data. The horizontal and vertical bars represent the
cuts on $\eta$ and cc.